(54) System and method for purging database update image files after completion of associated
transactions.



(57) A primary computer system has a database, 
application programs that modify the local database, 
and a transaction manager that stores audit records in 
a local image trail reflecting those application program 
modifications to the local database. In a remote backup 
system, a Receiver process receives audit records from 
the primary system. The audit records include audit
update and audit backout records indicating database
updates and database backouts generated by transactions
executing on the primary system. The Receiver
stores the audit update and audit backout records in one 
or more image trails. For each image trail there is an 
Updater process that applies to a backup database
volume the database updates and backouts indicated by
the audit update and audit backout records in the image
trail. The remote backup system periodically executes 
a file purge procedure, which identifies the oldest trans- 
action table from among the transaction tables in the last 
image trail file accessed for each of the image trails. 
Then, for each image trail, the file purge procedure
accesses the image trail files in a predefined chronological
order and for each accessed image trail file it compares 
a first set of newest transaction identifiers in the file's 
transaction table with a second set of oldest transaction 
identifiers in the identified oldest transaction table. The 
procedure purges the accessed image trail file only 
when all of the transaction identifiers in the first set are 
older than corresponding transaction identifiers in the 
second set. 
Description

[0001] This application is a continuation of, and claims 
priority on, U.S. provisional patent application serial no. 
60/118,770, filed February 4, 1999. 
[0002] The present invention relates generally to da- 
tabase management systems having a primary data- 
base facility and a duplicate or backup database facility, 
and particularly to a system and method for synchroniz- 
ing a backup database with a primary database while 
applications continue to actively modify the primary da- 
tabase. 

Background of the Invention 

[0003] The present invention is an improvement on 
the Tandem "remote data facility" (RDF) technology dis- 
closed in U.S. Patent Nos. 5,799,322 and 5,799,323, 
both issued August 25, 1998, which are hereby
incorporated by reference as background information.
[0004] The prior art Tandem RDF technology under- 
went a number of changes over time to increase the 
bandwidth of the system, where the bandwidth is indi- 
cated by the peak number of transactions per second 
that can be performed on the primary system and
replicated on the backup system. The present invention
represents a set of new techniques so as to achieve another
large increase in bandwidth. Some of the techniques
used by the present invention to increase bandwidth
violate basic assumptions of the prior art systems,
requiring both redesign of prior art mechanisms and some
completely new mechanisms, to ensure that the backup
system maintains "soft synchronization" with the primary
during normal operation, and to also ensure that the
backup system can be brought to an entirely consistent
internal state whenever the backup system needs to
perform a takeover operation and be used as the
primary system.

SUMMARY OF THE INVENTION 

[0005] In summary, the present invention is a distrib- 
uted computer database system having a local compu- 
ter system and a remote computer system. The local 
computer system has a local database stored on local 
memory media, application programs that modify the lo- 
cal database, and a transaction manager that stores au- 
dit records in a local image trail reflecting those applica- 
tion program modifications to the local database as well 
as commit/abort records indicating which of the trans- 
actions making those database modifications commit- 
ted and which aborted. Each audit record has an asso- 
ciated audit trail position in the local image trail, other- 
wise referred to as a MAT (master audit trail) position. 
[0006] The remote computer system, remotely locat- 
ed from the local computer system, has a backup data- 
base stored on remote memory media associated with 
the remote computer system. 



[0007] A remote duplicate data facility (RDF) is par- 
tially located in the local computer system and partially 
in the remote computer for maintaining virtual synchro- 
nization of the backup database with the local database. 
The RDF includes an Extractor process executed on the
local computer system, and a Receiver process and one 
or more Updater processes executed on the remote 
computer system. 

[0008] The Extractor process, executed on the local 
computer system, extracts audit records from the local 
image trail. It has a plurality of message buffers (four in 
the preferred embodiment) for buffering groups of the 
extracted audit records together and transmits each 
message buffer to the remote computer system when 
the buffer is full or a timeout occurs. 
[0009] The Receiver process, executed on the remote 
computer system, receives message buffers transmit- 
ted by the Extractor process and distributes the audit 
records in each received message buffer to one or more 
image trails in the remote computer system. The audit 
records include audit update and audit backout records 
indicating database updates and database backouts 
generated by transactions executing on the primary sys- 
tem. 

[0010] The Receiver process stores the audit update 
records in one or more image trails, and stores each im- 
age trail in a sequence of image trail files. The Receiver 
process also stores in each image trail file a transaction 
table representing a range of transaction identifiers for 
transactions potentially pending in the primary system 
at the time that the first audit record in the image trail 
file was generated by the primary system. 
[0011] For each image trail there is an Updater process
that applies to a backup database volume the database
updates and backouts indicated by the audit update
and audit backout records in the image trail. The
audit update and audit backout records are applied to
the backup database volume in the same order that they
are stored in the image trail, without regard to whether 
corresponding transactions in the primary system com- 
mitted or aborted. 

[0012] The remote computer system periodically ex- 
ecutes a file purge procedure which identifies the oldest 
transaction table from among the transaction tables in 
the last image trail file accessed for each of the image 
trails. Then, for each image trail, the file purge procedure 
accesses the image trail files in a predefined chronological
order and for each accessed image trail file it compares
a first set of newest transaction identifiers in the
file's transaction table with a second set of oldest trans- 
action identifiers in the identified oldest transaction ta- 
ble. The procedure purges the accessed image trail file 
only when all of the transaction identifiers in the first set 
are older than corresponding transaction identifiers in 
the second set. 
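
As an illustration only, the purge test described above can be sketched in Python as follows. The sketch assumes that transaction identifiers are integers in which a smaller value means an older transaction, that each transaction table maps a processor key to an (oldest, newest) pair, and that the oldest transaction table has already been identified. The names TransactionTable, can_purge and purgeable_files are hypothetical and are not taken from the RDF system.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

# Hypothetical key: (node name, processor number).
CpuKey = Tuple[str, int]


@dataclass
class TransactionTable:
    # For each processor: (oldest, newest) transaction identifiers.
    # Assumption: identifiers are integers and smaller means older.
    ranges: Dict[CpuKey, Tuple[int, int]]


def can_purge(file_table: TransactionTable, oldest_table: TransactionTable) -> bool:
    """True only when every newest identifier in the file's table is older
    than the corresponding oldest identifier in the oldest table."""
    for cpu, (_, newest_in_file) in file_table.ranges.items():
        if cpu not in oldest_table.ranges:
            # Assumption: with no corresponding entry, keep the file.
            return False
        oldest_needed, _ = oldest_table.ranges[cpu]
        if not newest_in_file < oldest_needed:
            return False
    return True


def purgeable_files(
    files_in_order: List[Tuple[str, TransactionTable]],
    oldest_table: TransactionTable,
) -> List[str]:
    """Visit files in the given chronological order and list those that pass."""
    return [name for name, table in files_in_order if can_purge(table, oldest_table)]
```

A file whose table contains even one identifier that is not older than its counterpart is retained, which matches the "only when all" condition stated above.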
BRIEF DESCRIPTION OF THE DRAWINGS

[0013] Additional objects and features of the invention 
will be more readily apparent from the following detailed 
description and appended claims when taken in con- 
junction with the drawings, in which: 

Fig. 1 is a block diagram of a prior art database man- 
agement system with a remote duplicate database 
facility. 

Fig. 2 is a conceptual representation of the check- 
point, context save, and failover procedures used 
by the system shown in Fig. 1.

Fig. 3 is a schematic representation of the configu- 
ration file used to define the configuration of each 
RDF system in a preferred embodiment. 

Fig. 4 is a block diagram of a database management
system having a plurality of parallel remote dupli- 
cate database facilities. 

Figs. 5A and 5B depict data structures used by the 
Extractor process in a preferred embodiment of the
present invention. 

Figs. 6A, 6B, 6C, 6D and 6E are flowcharts of pro- 
cedures executed by the Extractor process in a pre- 
ferred embodiment of the present invention. 

Fig. 7A is a block diagram of a Receiver context
record. Fig. 7B is a block diagram of a set of image 
trail context records. Figs. 7C, 7D, 7E, 7F, 7G and 
7H are block diagrams of data structures used by 
the Receiver and Purger processes in a preferred 
embodiment of the present invention. 

Figs. 8A, 8B, 8C, 8D, 8E, 8F, 8G, 8H, 8I, 8J and 8K
are flowcharts of procedures executed by the Re- 
ceiver process in a preferred embodiment of the 
present invention. 

Fig. 9 is a block diagram of data structures, stored 
in primary memory, used by each Updater process 
in a preferred embodiment of the present invention. 

Figs. 10A, 10B, 10C, 10D and 10E are flowcharts 
of procedures executed by the Updater processes 
in a preferred embodiment of the present invention. 

Fig. 11 depicts a flow chart of actions performed by
a backup system when performing an RDF takeo- 
ver, so as to prepare the backup system to take over 
and operate as the primary system. 

Fig. 12 depicts a flow chart of the procedure for
generating an Undo List of transactions whose final
state is unknown.

Fig. 13 depicts a transaction status table used in a 
preferred embodiment. 


Figs. 14A and 14B depict a flow chart of the Updater
Undo procedure for backing out updates for
incomplete transactions. 

Fig. 15 depicts a flow chart of a procedure for peri- 
odically purging image trail files no longer needed 
by the backup systems. 



Overview of RDF System 

[0014] Fig. 1 represents the basic architecture of Tan- 
dem Computer's RDF system, while Fig. 2 shows the 
relationship between some of the RDF processes and 
their respective local backup processes. In Tandem 
transaction processing systems each process has a re- 
spective local backup process that is automatically in- 
voked if the primary process fails. Each local backup 
process is located on a different CPU than its respective 
primary process, and provides a first level of fault pro- 
tection. A primary purpose of the RDF (remote data fa- 
cility) system is to handle failures in the primary system 
that cannot be resolved through the use of local backup 
processes (and other local remedial measures), such 
as a complete failure of the primary system. 
[0015] The computer system 100 shown in Fig. 1 has 
a transaction management facility 102 that writes audit
entries to a master audit trail (MAT) 104. The audit
entries indicate changes made to "audited files" on "RDF
protected volumes" 106 of a primary database 108 on 
a primary system 110. All RDF protected volumes are 
configured to write all transaction audit records to the 
MAT 104. 

[0016] The RDF system 120 includes processes on 
both the local processors 110, 160 and the remote backup
processors 122, 162. The RDF 120 maintains a replicated
database 124 (also called the backup database)
by monitoring changes made to "audited files" on "RDF
protected volumes" 106 on a primary system and applying
those changes to corresponding backup volumes
126 on the backup computer system 122. An "audited 
file" (sometimes called an "RDF audited file") is a file for 
which RDF protection has been enabled, and an "RDF 
protected volume" is a logical or physical unit of disk 
storage for which RDF protection has been enabled. 
[0017] On the primary computer system 110, an RDF
Extractor process 130 reads the master audit trail (MAT)
104, which is a log maintained by the transaction man- 
agement facility (TM/MP) of all database transactions 
that affect audited files, and sends all logical audit 
records associated with RDF-protected volumes to an 
RDF Receiver process 132 on the backup computer
system. 

[0018] The MAT 104 is stored as a series of files with
sequentially numbered file names. The MAT files are all
of a fixed size (configurable for each system), such as
64 Mbytes. The TMF 102 and Extractor 130 both are
programmed to progress automatically (and independ- 
ently) from one MAT file to the next. 

The Extractor Process - Overview

[0019] Referring to Fig. 5A, the Extractor process 130
adds a MAT position value 288 and a timestamp 290 to
each audit record that it extracts from the master audit
trail 104 and that is associated with a protected volume.
The MAT position value is the position in the MAT of the
extracted audit record. The added timestamp is known
as the RTD timestamp, and is the timestamp of the last
transaction to complete prior to generation of the audit
record in the MAT 104. The resulting record is called an
audit image record, or image record 284. The Extractor
process stores each audit image record in message 
buffers 242, each having a size of about 28K bytes in 
the preferred embodiment. 

[0020] The Extractor process uses two to eight message
buffers 242, with four message buffers being a typical
configuration. After filling and transmitting a message
buffer 242 to the Receiver process via a communication
channel 144 (Fig. 1), the Extractor process 130
does not wait for an acknowledgment reply message
from the Receiver process 132. Rather, as long as another
message buffer is available, it continues processing audit
records in the MAT 104, storing audit image records
in the next available message buffer 242. Each message
buffer 242 is made unavailable after it is transmitted
to the Receiver process 132 until a corresponding
acknowledgment reply message is received from the
Receiver process 132, at which point the message buffer
242 becomes available for use by the Extractor process
130.
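
The buffer handling described above can be sketched, purely for illustration, as a small pool that tracks which message buffers are free and which are awaiting acknowledgment. The class name MessageBufferPool, its methods, and the use of a sequence number to match acknowledgments to buffers are assumptions made for this sketch, not details of the Extractor process.

```python
from typing import Dict, List, Optional


class MessageBufferPool:
    """Hypothetical model of Extractor message buffer availability."""

    def __init__(self, count: int = 4) -> None:
        if not 2 <= count <= 8:
            raise ValueError("the text describes two to eight message buffers")
        self.free: List[int] = list(range(count))
        # Assumption: acknowledgments are matched by message sequence number.
        self.in_flight: Dict[int, int] = {}
        self.next_seq = 1

    def acquire(self) -> Optional[int]:
        """Return a free buffer index, or None if every buffer awaits an ack."""
        return self.free.pop(0) if self.free else None

    def transmit(self, buf: int) -> int:
        """Mark a filled buffer as sent; it stays unavailable until acknowledged."""
        seq = self.next_seq
        self.next_seq += 1
        self.in_flight[seq] = buf
        return seq

    def acknowledge(self, seq: int) -> None:
        """An acknowledgment reply makes the corresponding buffer available again."""
        self.free.append(self.in_flight.pop(seq))
```

With this model the sender never blocks on an individual acknowledgment; it only stalls when acquire returns None, that is, when all buffers are in flight.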

[0021] The Extractor process 130 performs a single
checkpoint operation during startup of the Extractor
process, and that checkpoint 158 only sends a takeover
location to the backup Extractor process 150. (See Fig.
2.) It also does not durably store a context record. Rather,
the Extractor process 130 relies on information received
from the Receiver process 132 when RDF is either
starting up or restarting, as will be explained in more
detail below.
[0022] Unlike prior art implementations, in the present
invention the Extractor sends to the backup system all
logical audit for the protected volumes, including update
audit representing changes to database records made
by transactions, and backout audit records (also called
undo audit records) for aborted transactions. For updates
and backouts, the Extractor stores both the before
and after images of the updated database record in the
image records sent to the backup system to enable
these operations to be both redone and then reversed
by the Updaters 134.

[0023] The Extractor 130 also sends all transaction 
state records and TMP control point records to the backup
system. There are five types of transaction state
records: active, prepared, aborting, committed and 
aborted. TMP control point records indicate the bound- 
aries of transaction computing intervals (called control 
point intervals) in the primary system. Every transaction 
is guaranteed to generate at least one transaction state 
record during each control point interval in which it is 
active (i.e., alive), except possibly the control point in- 
terval in which the transaction starts. 
[0024] These transaction state and TMP control point
records and their processing by the RDF system will be 
explained in more detail below. 

The Receiver Process - Overview 

[0025] The Receiver process 132 immediately ac- 
knowledges each received message buffer. No process- 
ing of the message buffer is performed before the ac- 
knowledgment is sent. The RDF system provides tight 
synchronization of the Extractor and Receiver processes
and provides for automatic resynchronization whenever
a start or restart condition occurs. For example, the
two processes will resynchronize whenever either proc- 
ess is restarted or has a primary process failure, and 
whenever the Receiver process receives audit records 
out of order from the Extractor process. 
[0026] The Receiver process 132 sorts received audit 
records such that (A) transaction state records, includ- 
ing commit/abort records, and control point records are 
stored only in the master image trail 136, and (B) each 
database update and backout audit record is moved into 
only one image trail 138 corresponding to the only Updater
process 134 that will use that audit record to update
data stored on a backup volume 126.
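
The sorting rule in items (A) and (B) above might be sketched as follows. This is a minimal illustration under stated assumptions: the record kinds are represented as strings, the mapping from primary volume to image trail is supplied by the caller, and the names AuditRecord, route, MASTER_TRAIL and the volume names in the usage example are hypothetical. Other record types are not handled by the sketch.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, NamedTuple, Optional

MASTER_TRAIL = "master"  # hypothetical label for the master image trail


class AuditRecord(NamedTuple):
    kind: str                # "update", "backout", "state" or "control_point"
    volume: Optional[str]    # primary volume, for update and backout records


def route(
    records: Iterable[AuditRecord], volume_to_trail: Dict[str, str]
) -> Dict[str, List[AuditRecord]]:
    """Distribute records to image trails, preserving arrival order."""
    trails: Dict[str, List[AuditRecord]] = defaultdict(list)
    for rec in records:
        if rec.kind in ("state", "control_point"):
            # (A) transaction state and control point records go only
            # to the master image trail.
            trails[MASTER_TRAIL].append(rec)
        elif rec.kind in ("update", "backout"):
            # (B) each update or backout record goes to exactly one trail,
            # the one read by the Updater for that volume.
            trails[volume_to_trail[rec.volume]].append(rec)
        else:
            raise ValueError(f"record kind not covered by this sketch: {rec.kind}")
    return dict(trails)
```

For example, route([AuditRecord("update", "$DATA1"), AuditRecord("state", None)], {"$DATA1": "trail1"}) places the first record in "trail1" and the second in "master".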
[0027] While sorting and storing the received audit 
records, the Receiver process 132 determines the old- 
est and newest transactions active during each TMP 
control period for each processor of each node of the 
primary system, and stores this information in its context 
record. This "active transaction" information is also 
stored in the audit image trails used by the Updaters. 
The "active transaction" information is used to efficiently 
identify image trail files that can be purged because they 
are no longer needed by the system. 
[0028] Whenever the Receiver process receives a 
special "Stop Updaters" audit record, it copies that 
record into all the image trails. The Stop Updaters audit 
record, produced on the primary system 110 by special 
"online DDL" procedures, causes each Updater 134 to 
stop. Each Updater logs a message indicating that it has 
shut down because it read a particular Stop Updaters 
record. When all the Updaters have shut down in response
to the same Stop Updaters record, the operator
of the RDF should (A) perform the same DDL procedure
on the remote backup system as was performed by the
online DDL procedure on the primary system, and then
(B) restart the Updaters. This procedure is used to
ensure continued virtual synchronization of the local and 
remote database when "online DDL" procedures are 
used to restructure database objects on the primary sys- 
tem with minimal interruption of user access to the da- 
tabase objects being restructured. 
[0029] The Receiver process performs a single 
checkpoint operation during startup of the Receiver 
process, and that checkpoint 164 only sends a takeover
location to the backup Receiver process 152. (See Fig.
2.) However, it does periodically (e.g., once every 5 to
15 seconds) durably store a Receiver context record
270 and a set of Image Trail context records 271 on a
nonvolatile (disk) storage device 172 (see Figs. 7A and
7B).

Purger Process - Overview 

[0030] The Purger process periodically deletes image 
trail files that are not needed, even in the event of a take- 
over. Because the updaters apply audit to the backup 
database even for transactions whose outcome is un- 
known, the Purger only deletes image trail files all of 
whose audit records correspond to transactions whose 
outcome is known to the backup system. 

Updater Processes - Overview 

[0031] Each RDF-protected volume 106 on the primary
computer system 110 has its own Updater process
134 on the backup computer system 122 that is responsible
for applying audit image records to the corresponding
backup volume 126 on the backup computer system
122 so as to replicate the audit protected files on that 
volume. Audit image records associated with both com- 
mitted and aborted transactions on the primary system 
are applied to the database on the remote backup com- 
puter system 122. In the present invention, no attempt 
is made to avoid applying aborted transactions to the 
backup database, because it has been determined that 
it is much more efficient to apply both the update and 
backout audit for such transactions than to force the up- 
daters to wait until the outcome of each transaction is 
known before applying the transaction's updates to the 
backup database. By simply applying all logical audit to 
the backup database, the updaters are able to keep the 
backup database substantially synchronized with the 
primary database. Also, this technique avoids disruptions
of the RDF system caused by long running transactions.
In previous versions of the Tandem RDF system,
long running transactions would cause the backup
system to completely stop applying audit records to the 
backup database until such transactions completed. 
[0032] The audit image records in each image trail 
136, 138 are typically read and processed by one to ten
Updaters 134. Each Updater 134 reads all the audit image
records in the corresponding image trail, but utilizes
only the audit image records associated with the primary 
disk volume 106 for which that Updater is responsible. 
[0033] At periodic intervals, each Updater durably 
stores its current image trail position to disk in a context
record. This position is called the Restart image trail po- 
sition. 

[0034] When an Updater process 134 reaches a limit
position specified by the Receiver, which is treated by
the Updater as the logical end of file of the image trail
136, 138 to which it is assigned, it performs a wait for a
preselected amount of time, such as two to ten seconds,
before sending another message to the Receiver to request
an updated limit position. Only when the limit position
is updated can the Updater read more audit image
records.
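
The limit position behavior described above could be sketched as the following loop. It is a simplified illustration, not the Updater's actual procedure: image trail positions are modeled as list indices, the Receiver is modeled as a callable that returns the current limit position, and the names run_updater, request_limit and max_requests are hypothetical. The sleep function is injectable so that the sketch can be exercised without real delays.

```python
import time
from typing import Callable, Sequence


def run_updater(
    trail: Sequence[str],
    request_limit: Callable[[], int],
    apply: Callable[[str], None],
    wait_seconds: float = 2.0,
    max_requests: int = 3,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Apply records up to the limit position, then wait and ask for a new limit.

    Returns the position reached after max_requests limit requests.
    """
    position = 0
    for _ in range(max_requests):
        # The limit is treated as the logical end of file of the image trail.
        limit = min(request_limit(), len(trail))
        while position < limit:
            apply(trail[position])
            position += 1
        # Wait for the preselected time before requesting an updated limit.
        sleep(wait_seconds)
    return position
```

The Updater in this model never reads past the last limit it was given, even if more records are already present in the trail.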

Monitor Process - Overview 

[0035] Monitor process 140 and another process
called RDFCOM (which will be collectively referred to
as the Monitor for the purpose of this document) are
used to coordinate tasks performed in response to user
commands to the RDF system.


RDF Configuration File 

[0036] Referring to Fig. 3, the structure of each RDF
system 120 is represented by a configuration file 180
that is stored on the control volume of the primary system
110 and the control volume of the backup system
122 associated with the RDF system. The RDF configuration
file 180 includes

• a global RDF configuration record 181;

• a Monitor configuration record 182 for identifying
characteristics of the RDF system's Monitor process;

• an Extractor configuration record 183 for identifying
characteristics of the RDF system's Extractor process;

• a Receiver configuration record 184 for identifying
characteristics of the RDF system's Receiver process;

• a Purger configuration record 185 for identifying
characteristics of the RDF system's Purger process;

• one Updater configuration record 186 for each of
the RDF system's Updaters, for identifying characteristics
of the corresponding Updater process; and

• one Image Trail configuration record 187 for each
image trail in the backup system.

[0037] The information stored in the global RDF
configuration record 181 includes:

• the node name of the primary system; 

• the node name of the backup system; 



EP 1 093 055 A2 






• the control subvolume used by the RDF system; 

• the time that the RDF system was initialized; 

• the name and location of the RDF system's log file; 

• the number of image trails in the backup system; 

• the number of protected volumes, which is also the 
number of Updaters in the backup system; 

• the number of message buffers used by the RDF 
system; 

and other information not relevant here. 
[0038] Each of the process configuration records 
182-187 includes information identifying the CPUs on 
which that process and its backup run, the priority assigned
to the process, the name of the process, and so on.
In addition, the Receiver configuration record 184 also 
specifies the size of each of the image trail files and the 
volume used to store the master image trail files. 
[0039] The Purger configuration record 185 includes 
a parameter called the image trail RetainCount, which 
indicates the minimum number of image trail files to be 
retained for each image trail. 
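
As a hedged illustration of how a RetainCount parameter might limit purging, the sketch below excludes the newest files of an image trail from consideration. It assumes that the retained files are the most recent ones and that the file list is ordered oldest first; the function name purge_candidates and the file names in the usage example are hypothetical, and the files returned would still have to pass the transaction table comparison before being purged.

```python
from typing import List


def purge_candidates(files_oldest_first: List[str], retain_count: int) -> List[str]:
    """Return the files that may be considered for purging.

    Assumption: the newest retain_count files of the trail are always kept.
    """
    if retain_count < 0:
        raise ValueError("retain_count must not be negative")
    keep_from = max(len(files_oldest_first) - retain_count, 0)
    return files_oldest_first[:keep_from]
```

For instance, with four files and a RetainCount of 2, only the two oldest files are candidates; with a RetainCount of 5, none are.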

[0040] The Updater configuration records 186 each 
identify the image trail from which the associated Up- 
dater process is to read audit information, the primary 
volume whose audit information is to be processed by 
the Updater, and the backup volume to which database 
updates are to be applied by the Updater. 
[0041] Each Image Trail configuration record 187 identifies
the disk volume on which the image trail files for
the corresponding image trail are to be stored. 

Using Parallel RDF Systems - Overview 

[0042] Referring to Fig. 4, there is shown a system in 
which data volumes 106 on a primary system 110 are 
protected by two or more parallel RDF systems 220. 
Each RDF system 220 contains its own copy of all the 
processes and data structures shown in Fig. 1 for a sin- 
gle RDF system 120. 

[0043] Identical copies of the entire configuration file 
for each RDF system are stored on the primary and
backup systems, while the context, exceptions and im- 
age files are only on the backup system. 
[0044] Having multiple backup copies of a database 
is especially useful in at least two commercial settings: 

1) Applications that perform intensive read only 
(browse mode) queries. A classic example of this 
would be a telephone billing system in which billing 
database updates are performed on the primary 
system and telephone directory inquiries are per- 
formed on the backup system. 

2) Applications in which "triple contingency" protection
is required. The relevant triple contingency is the
failure of the primary database system and one remotely
located backup system (two contingencies)
during overlapping time periods (the third contingency).
In particular, in such applications, it is unacceptable
to run applications on a single system
without a backup system. Rather, it is required (A)
that the primary system have at least two parallel
backup systems, (B) after losing the primary system,
one backup system is set up as the new primary
system, (C) another backup system is set up
as the backup to the new primary system, and (D)
a new RDF system is established to replicate data
from the new primary system onto that other backup
system. Thus data on the primary system, even
when it is actually a former backup system, is always
protected by at least one RDF system. Examples
of systems where triple contingency protection
may be required are large banking systems, or a
national monetary transaction or clearing house
system.

[0045] Having a single RDF system configured to replicate
databases across multiple backup systems is not
practical for a number of reasons. For example, the Extractor
process would be required to ship an audit buffer
to multiple backup systems. But if the communication
path to even one of the backup systems went down, either
the Extractor system would have to cease shipping
audit information to all the backup systems until the
communication path problem were solved, or it would
need to keep track of what audit information had been
shipped to each of the backup systems (which would be
inefficient). As a result, when multiple backup systems
are needed, multiple RDF systems 220 with a common
primary node are used.

[0046] In order to keep track of the locations of the
files used by each of the parallel RDF systems 220, the
following file naming convention is used in a preferred
embodiment. The "pathname" of each RDF system's
configuration file is preferably of the form
"$SYSTEM.xxx.config", where $SYSTEM is always the name of
the control volume of any node in the system 100, "config"
identifies the file as an RDF configuration file, and
"xxx" is a "subvolume" name that uniquely identifies the
RDF system 120. When a primary system 110 is protected
by more than one RDF system, each of those RDF
systems will have a different subvolume name. In the
preferred embodiment, the subvolume name assigned
to each RDF system is composed of the node name of
the primary system and a one-character alphanumeric
(e.g., 1, 2, ... or any letter) subvolume suffix. For instance,
if the node name of the primary system 110 is
"A", and two parallel RDF systems are used, their respective
config files would likely be named $SYSTEM.A1.config
and $SYSTEM.A2.config.
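
The configuration file naming convention just described can be illustrated with a short helper. This is a sketch only: the function names rdf_subvolume and config_path are hypothetical, and the validation of the suffix as a single ASCII letter or digit is an assumption drawn from the description of the subvolume suffix.

```python
def rdf_subvolume(primary_node: str, suffix: str) -> str:
    """Build the subvolume name: primary node name plus a one-character suffix."""
    if len(suffix) != 1 or not (suffix.isascii() and suffix.isalnum()):
        raise ValueError("suffix must be a single letter or digit")
    return primary_node + suffix


def config_path(primary_node: str, suffix: str, control_volume: str = "$SYSTEM") -> str:
    """Build a pathname of the form $SYSTEM.xxx.config for one RDF system."""
    return f"{control_volume}.{rdf_subvolume(primary_node, suffix)}.config"
```

For a primary node named "A" with two parallel RDF systems, config_path("A", "1") and config_path("A", "2") give $SYSTEM.A1.config and $SYSTEM.A2.config, matching the example in the text.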
[0047] As shown in Fig. 4, similar file naming conventions
are used for the context file, exceptions file and
image files of each RDF system 220, as explained
above. Each RDF system's context file stores all the
context records for that system. Each time a context
record is durably stored, that record is stored in the context
file on disk. The exceptions files and image files are
discussed in more detail below. In the preferred embodiment,
image trails are stored on user selected volumes,
which are different than the control volume $SYSTEM,
but they still use the same "xxx" control subvolume
name as the corresponding configuration and context
files.

[0048] It should be noted that the RDF configuration, 
context and Exceptions files previously stored on a 
backup system's control subvolume (e.g., $SYSTEM.A1)
must be deleted before a new RDF configuration
using the same backup system can be initialized. The 
RDF system will automatically purge any old image trail 
files when a new RDF system is first started. 

Audit Record Types 

[0049] The master audit trail (MAT) 104 contains the 
following types of records: 

• Update records, which reflect changes to a data- 
base volume made by a transaction by providing 
before and after record images of the updated da- 
tabase record. Each update record indicates the 
transaction ID of the transaction that made the da- 
tabase change and the identity of the database vol- 
ume and database record that has been updated. 

• Backout records, which reflect the reversal of pre- 
vious changes made to a database volume. The da- 
tabase changes represented by backout records 
are sometimes herein called update backouts and 
are indicated by before and after record images of 
the updated database record. Backout audit 
records are created when a transaction is aborted 
and the database changes made by the transaction 
need to be reversed. Each backout record indicates 
the transaction ID of the transaction that made the 
database change and the identity of the database 
volume and database record that has been modi- 
fied by the update backout. 

• Transaction state records, including commit and 
abort records, transaction active records, as well as 
transaction prepared and transaction aborting 
records. Commit and abort records indicate that a 
specified transaction has committed or aborted.
Transaction active records (also sometimes called 
transaction alive records) as well as transaction 
prepared and transaction aborting records indicate 
that a transaction is active. Each transaction state 
record indicates the transaction ID of the transac- 
tion whose state is being reported. Every active 
transaction is guaranteed to produce one transac- 
tion state record during each TMP control time 
frame (i.e., between successive TMP control 
points). A transaction active record is stored in the 
master audit trail if the transaction does not commit 
or abort during a TMP control time frame. 

• TMP control point records, which are "timing 
markers" inserted by the TMF 102 into the master 
audit trail at varying intervals depending on the 
system's transaction load. During heavy transaction 
loads, TMP control point records may be inserted 
less than a minute apart; at moderate transaction 
loads the average time between TMP control point 
records is about 5 minutes; and under very light 
loads the time between TMP control point records 
may be as long as a half hour. The set of audit 
records between two successive TMP control point 
records are said to fall within a "TMP control 
time frame". 

• Stop Updaters records, which cause all Updaters 
to stop when they read this record in their 
image trails. 

• Other records not relevant to the present dis- 
cussion. 

Detailed Explanation of Extractor Process 

[0050] Referring to Figs. 5A and 5B, the primary data 
structures used by the Extractor process 130 are as follows. 
As stated earlier, the Extractor process 130 utilizes 
two or more message buffers 242. A portion of each 
message buffer 242 is used to store a "header" 280, 
which includes (A) a message sequence number and 
(B) a timestamp. The body 282 of the message buffer 
242 is used to store audit image records 284. Each audit 
image record 284 includes an audit information portion 
286, a MAT position value 288 and a RTD (relative time 
delay) timestamp value 290. The audit information por- 
tion 286 is copied from the audit record in the MAT 104, 
while the MAT position 288 of the audit record and RTD 
timestamp field 290 are added by the Extractor process 
to create an "audit image record" 284. 
[0051] The audit information portion 286 consists of 
the standard information found in audit records in the 
MAT 104, such as before and after field values for a 
modified row in a database table, or a commit/abort in- 
dication for a completed transaction. Other audit records 
in the MAT that are relevant to this document are the 
other types of transaction state records mentioned 
above, TMP control point records, and "Stop Updaters" 
audit records. 

[0052] The Extractor process 130 also maintains a 
message buffer status table 294, which indicates for 
each message buffer whether that buffer is available for 
use, or not available because it is currently in use by the 
Extractor. In addition, the Extractor process 130 main- 
tains a message sequence number in register 295, a 
MAT file pointer in register 296, a local timestamp value 
in register 297, and a scratch pad 298 in which it stores 
audit image records that it is currently processing. 
[0053] Finally, the Extractor process 130 includes a 
data structure 299 for storing reply messages received 
from the Receiver process 132. This data structure 
includes a first field indicating the type of message 
received, which is equal to either "message buffer 
acknowledgment" or "resynch reply", a message buffer 
identifier, and a "message value" field. The message 
value field is equal to a MAT position value when the 
message type is "resynch reply," and is equal to either 
an "OK" or "Error" condition code when the message 
type is "message buffer acknowledgment". 
[0054] Referring to Figs. 6A-6E, the Extractor process 
130 works as follows. 

[0055] The Extractor Startup Procedure 300 is called 
whenever the Extractor process 130 or its backup starts 
up, as in the case of a failover or a transfer of control 
back to the primary Extractor process 130 from the 
backup Extractor process. The Startup procedure begins 
by performing a "static initialization" of the Extractor 
process (302), which means that all static data structures 
used by the Extractor process are allocated and 
initialized. While initializing static data structures, the 
Extractor process reads information denoting the set of 
RDF protected objects from the configuration file, and 
builds an internal table of RDF protected disk volumes. 
This table is used later as an audit record filter, such 
that audit records for non-RDF protected volumes are 
ignored by the Extractor process. The startup procedure 
then creates a backup process (304). Then a checkpoint 
operation is performed in which a takeover location is 
transmitted to the backup Extractor process (306). The 
takeover location is, in essence, a program address, and 
in the preferred embodiment the takeover location is the 
program location at which execution of the volatile 
initialization procedure 310 begins. Finally, the Extractor 
Startup procedure calls (308) the Extractor Volatile 
Initialization procedure 310. 

[0056] The Extractor Volatile Initialization procedure 
310 is called during startup by the Extractor Startup 
procedure 300, when the Extractor receives an Error reply 
message from the Receiver, and whenever there is an 
Extractor process failure. The Extractor Volatile 
Initialization procedure begins by allocating and 
initializing all volatile data structures used by the 
Extractor process, including message buffers 142, the 
message buffer status array 294 (312), and the message 
sequence number (which gets initialized to an initial 
value such as 1). Then the Extractor Volatile 
Initialization procedure transmits a Resynchronization 
Request message to the Receiver process (314) and waits 
for a Resynch Reply message (316). The Resynch Reply 
message will contain a MAT position value, which the 
Extractor Volatile Initialization procedure moves (318) 
into the MAT position MATpsn 296. Finally, the Extractor 
Volatile Initialization procedure calls (320) the main 
Extractor procedure 330. 
[0057] The Main Extractor procedure 330 begins by 
initializing and starting a timer called the Message Timer 
(MsgTimer) (332). The Message Timer is typically 
programmed to expire in one second, although the timeout 
period is configurable to virtually any value. Next, the 
Extractor procedure reads a record in the MAT (334). If 
the MAT record is a logical audit (i.e., update or backout) 
record for an RDF protected volume, a transaction state 
record for any transaction, a TMP control point record, or 
a "Stop Updaters" record, an audit image record is made 
by copying the MAT record, adding to it the MAT position 
of the current MAT record, and adding an RTD timestamp 
(336). The added RTD timestamp is the timestamp of the 
last transaction to complete prior to generation of the 
audit image record. Every time the Extractor procedure 
encounters a commit or abort audit record, it moves a copy 
of the timestamp in that record into its local timestamp 
register 297. The value in the local timestamp register 
297 is the RTD (relative time delay) timestamp that is 
added to audit records so as to generate an audit image 
record, also known as an image record. 
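
The filtering and stamping performed in steps 334-336 can be sketched roughly as follows. This is an illustrative model only, not the patented implementation: the class names, field names and record-kind strings are hypothetical, and the sketch assumes that the local timestamp register is updated before the image record for a commit or abort record is built, so that such a record carries its own timestamp. 

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, Optional

# Record kinds, named after the audit record types listed in the text.
UPDATE, BACKOUT, COMMIT, ABORT = "update", "backout", "commit", "abort"
TX_ACTIVE, CONTROL_POINT, STOP_UPDATERS, OTHER = (
    "tx_active", "control_point", "stop_updaters", "other")


@dataclass
class AuditRecord:
    kind: str
    mat_position: int                 # position of the record in the MAT
    volume: Optional[str] = None      # set for update and backout records
    timestamp: Optional[int] = None   # set for commit and abort records


@dataclass
class AuditImageRecord:
    audit: AuditRecord                # audit information portion (286)
    mat_position: int                 # MAT position value (288)
    rtd_timestamp: Optional[int]      # RTD timestamp value (290)


def extract(records: Iterable[AuditRecord],
            protected: set) -> Iterator[AuditImageRecord]:
    """Yield image records for the MAT records the Extractor would ship."""
    last_completion_ts = None         # stands in for timestamp register 297
    for rec in records:
        if rec.kind in (COMMIT, ABORT):
            last_completion_ts = rec.timestamp
        if rec.kind == OTHER:
            continue                  # not relevant to replication
        if rec.kind in (UPDATE, BACKOUT) and rec.volume not in protected:
            continue                  # volume is not RDF protected
        yield AuditImageRecord(rec, rec.mat_position, last_completion_ts)
```

Transaction state, TMP control point and Stop Updaters records pass through regardless of volume, which mirrors the rule that only logical audit records are filtered by the RDF protected volume table. 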

[0058] If the message buffer currently in use has room 
for the resulting audit image record (338) it is moved into 
the message buffer (340). Then the Extractor procedure 
continues processing the next record in the MAT at step 
334. 

[0059] If the message buffer currently in use is full 
(338), the values stored in the message sequence 
number register 295 and the timestamp register 297 are 
inserted into the Message Buffer's header 280 (342). 
The Extractor procedure then transmits the message 
buffer to the Receiver process (344). After transmitting 
the message buffer, the Message Buffer Status array 
294 is updated to indicate that the message buffer just 
transmitted is not available for use. In addition, the 
Message Timer is cleared and restarted, and the Message 
Sequence Number in register 295 is increased by one 
(346). Finally, the audit image record that did not fit in 
the last message buffer is moved into the next available 
message buffer (348). If a next message buffer is not 
available, the Extractor procedure waits until one 
becomes available and then moves the audit image record 
into it. Then the Extractor procedure continues 
processing the next record in the MAT at step 334. 
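
A simplified model of the buffer-full path (steps 338-348) is shown below. It is a sketch under stated assumptions: the `BufferedSender` class, its byte-based `capacity`, and the `sent` list that stands in for transmission to the Receiver are all hypothetical, and the sketch leaves out the Message Timer and the wait for a free buffer. 

```python
from typing import List, Tuple


class BufferedSender:
    """Toy model of the Extractor filling and shipping message buffers."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity          # bytes per buffer (hypothetical)
        self.current: List[bytes] = []    # body of the buffer in use
        self.used = 0
        self.seq = 1                      # message sequence number (295)
        # Each entry is (sequence number, header timestamp, records).
        self.sent: List[Tuple[int, int, List[bytes]]] = []

    def add(self, record: bytes, rtd_timestamp: int) -> None:
        if self.used + len(record) > self.capacity:     # step 338: no room
            # Steps 342-344: fill in the header and transmit the buffer.
            self.sent.append((self.seq, rtd_timestamp, self.current))
            self.seq += 1                               # step 346
            self.current, self.used = [], 0             # step 348: next buffer
        self.current.append(record)
        self.used += len(record)
```

The record that did not fit is placed in the fresh buffer, so no audit image record is dropped or split across messages in this model. 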

[0060] When the audit record read (334) from the MAT 
104 is not an audit record for an RDF protected volume, 
is not a transaction state record, is not a "Stop Updaters" 
record and is not a TMP control point record, the audit 
record is ignored and the next audit record (if any) in the 
MAT is read (334). 

[0061] The purpose of the Message Timer is to ensure 
that audit image records are transmitted to the Receiver 
process in a timely fashion, even when the rate at which 
audit records are generated for RDF protected files is 
low. Referring to Fig. 6D, when the Message Timer times 
out the Message Timer procedure 360 first checks to 
see if the current Message Buffer is empty (i.e., contains 
no audit image records) (362). If so, a timestamp 
indicative of the current time is inserted into the Message 
Buffer header 280 (364). If not, the timestamp value from 
the last commit/abort record, stored in RTD timestamp 
register 297, is inserted into the Message Buffer header 
(366). Then the current Message Sequence Number is 
inserted in the Message Buffer header (368) and the 
Message Buffer is transmitted to the Receiver (370). 
After transmitting the message buffer, the Message Buffer 
Status Array 294 is updated to indicate that the message 
buffer just transmitted is not available for use, the 
Message Timer is cleared and restarted, and the Message 
Sequence Number in register 295 is increased by one 
(372). 
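
The header logic of the Message Timer procedure (steps 362-368 and the sequence number increment of step 372) could be expressed as in the following sketch. The `MessageBuffer` class and the `on_message_timer` function are hypothetical names; actual transmission and the buffer status update are omitted. 

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class MessageBuffer:
    records: List[object] = field(default_factory=list)
    sequence_number: Optional[int] = None   # header field (A)
    timestamp: Optional[float] = None       # header field (B)


def on_message_timer(buf: MessageBuffer, msg_seq: int,
                     rtd_timestamp: float, now: float) -> int:
    """Fill the buffer header on timeout; return the next sequence number."""
    if not buf.records:
        buf.timestamp = now             # step 364: empty buffer, current time
    else:
        buf.timestamp = rtd_timestamp   # step 366: last commit/abort time
    buf.sequence_number = msg_seq       # step 368
    return msg_seq + 1                  # step 372
```

Using the current time for an empty buffer lets the backup system observe that the primary is idle rather than lagging. 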

[0062] When the Extractor process receives a reply 
from the Receiver process acknowledging receipt of a 
message buffer (374), if the reply message indicates the 
message buffer was received without error, the Mes- 
sage Buffer Status Array 294 is updated to indicate that 
the message buffer identified in the reply message is 
available for use (376). 

[0063] If the reply message received by the Extractor 
process from the Receiver process indicates that the 
Extractor must restart, then the Extractor and Receiver 
must resynchronize with each other. The Receiver process 
tells the Extractor process to restart whenever (A) 
a message with an out-of-sequence Message Sequence 
Number is received, or (B) the Receiver process starts up 
after a failover or return of control back to the primary 
Receiver process from the backup Receiver process 
(sometimes called a CheckSwitch). When the Extractor 
process receives an error condition reply message from 
the Receiver process that indicates the need to 
resynchronize, it waits for any pending message 
acknowledgment replies to be received for any other 
message buffers transmitted prior to receipt of the error 
condition reply message, and it ignores those reply 
messages (378). Then the Extractor process calls the 
Extractor Volatile Initialization procedure (379) so as to 
resynchronize the Extractor process with the Receiver 
process. 
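
The reply handling of steps 374-379 can be modelled as a small state machine. The sketch below is illustrative only: `ReplyHandler`, its `in_flight` set and the `resync` callback (standing in for the Extractor Volatile Initialization procedure) are hypothetical names. 

```python
from typing import Callable, Set


class ReplyHandler:
    """Tracks unacknowledged buffers and triggers resynchronization."""

    def __init__(self, resync: Callable[[], None]) -> None:
        self.in_flight: Set[int] = set()   # sent but not yet acknowledged
        self.draining = False              # True after an error reply
        self.resync = resync               # stand-in for procedure 310

    def sent(self, buffer_id: int) -> None:
        self.in_flight.add(buffer_id)

    def on_reply(self, buffer_id: int, code: str) -> None:
        # Step 376: a replied-to buffer is no longer outstanding.
        self.in_flight.discard(buffer_id)
        if code == "Error":
            self.draining = True
        # Step 378: while draining, other replies are awaited but ignored.
        if self.draining and not self.in_flight:
            self.draining = False
            self.resync()                  # step 379
```

A buffer is considered available for reuse in this model whenever its identifier is absent from `in_flight`. 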

Detailed Description of Receiver Process 

[0064] The primary data structures used by the Receiver 
process 132 in the preferred embodiment are 
shown in Figs. 7A-7G. As stated earlier, the Receiver 
process durably stores a Receiver context record 270 
and a set of Image Trail context records 271 on a 
nonvolatile (disk) storage device 272 on a periodic basis. 
The Receiver context record 270 includes a 
Receiver.StopUpdatersCnt count value 391, a 
Takeover_Completed flag 391A (used to indicate when 
an RDF takeover operation has been completed), a 
NumNode array 391B and a previous SysTxList 391C 
(used for purging old image trail files). 
[0065] Each image trail's context record 271 includes 
a MAT position, a MIT position, and the next write 
position. In some circumstances, the Receiver context 
record 270 and a set of Image Trail context records 271 
may be collectively called the Receiver context record or 
Receiver context records, since these context records are 
collectively used to enable the Receiver process to 
restart itself and to resynchronize with the Extractor 
process. 

[0066] Two image trail buffers 274 are used for each 
image trail, and these are used in alternating fashion. 
Referring to Fig. 7D, each image trail buffer 274 consists 
of fourteen blocks 393 of data where the size of each 
block is 4K bytes. Each 4K block 393 begins with a block 
header 394 that includes: 

• the block's file storage location, consisting of the 
relative byte address (rba) of the beginning of the 
block with respect to the beginning of the image trail 
file; 

• a Master image trail (MIT) position indicator, 
indicating the location of the MIT block in which the 
Receiver last wrote a commit/abort record before any 
audit records were stored in the current image trail 
block 393; 

• a pointer to the first audit image record to start in 
the buffer block (i.e., in almost all circumstances the 
first image record to start in the buffer will not be 
stored starting at the beginning of the body of the 
buffer block); 

• a pointer to the end of the last record to complete 
in the block; 

• a pointer to the next available byte in the block (if 
there is one); and 

• the MAT position of the audit image record at the 
beginning of the buffer block (which will usually 
begin in an earlier block). 

[0067] Audit image records rarely conform exactly to 
buffer block boundaries, and therefore the audit image 
record at the end of one buffer block usually continues 
at the beginning of the next, as shown in Fig. 15C. 

[0068] A typical MIT position value would be "10, 
8192", where the "10" represents the file sequence 
number within the corresponding sequence of image 
trail files, and the "8192" represents a relative byte offset 
from the beginning of the image trail file to a block 
header. 
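
For illustration, a MIT position of this form can be parsed into its two components as sketched below. The `MitPosition` type and the helper functions are hypothetical; the sketch assumes that offsets fall on 4K block boundaries (since each block begins with a header) and that positions can be ordered by comparing the file sequence number first and the byte offset second. 

```python
from typing import NamedTuple

BLOCK_SIZE = 4096  # 4K blocks, as stated for the image trail buffers


class MitPosition(NamedTuple):
    file_seq: int   # image trail file sequence number
    rba: int        # relative byte offset of a block header in that file


def parse_mit_position(text: str) -> MitPosition:
    """Parse a string such as "10, 8192" into its two components."""
    seq, rba = (int(part) for part in text.split(","))
    return MitPosition(seq, rba)


def block_index(pos: MitPosition) -> int:
    """Index of the 4K block within the file that the position refers to."""
    return pos.rba // BLOCK_SIZE
```

With these definitions, "10, 8192" refers to the third block (index 2) of image trail file number 10. 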

[0069] As explained earlier, every audit record 
shipped to the Receiver process 132 has a MAT position 
value inserted in it by the Extractor process. The MAT 
position in an image trail context record 271 indicates 
the MAT position of the last audit record durably stored 
in the image trail file. 

[0070] The MIT position in an image trail context 
record 271 indicates a MIT position associated with the 
last durably stored image trail block. This is the MIT 
position in the last 4K block header of the last image trail 
buffer stored before the image trail context record 271 
was last stored. 

[0071] Furthermore, each image trail buffer 274 is 
written to the corresponding disk file only (A) when the 
image trail buffer 274 is full (i.e., contains 52K of data) 
or (B) when the Receiver process performs a periodic 
flush operation. Each time data from any image trail 
buffer 274 is written to disk, the disk file location for the 
next write to the image trail file (i.e., the disk address for 
the current end of the image trail file) is stored in the 
appropriate field of the image trail context record 271. 
However, as will be described below, the image trail 
context record is durably stored once every M seconds, 
where M is the number of seconds between executions of 
the Receiver context save procedure. 
[0072] The Receiver.StopUpdatersCnt 391 is a count 
value that is incremented each time the Receiver en- 
counters a StopUpdaters record in a received message 
buffer whose MAT value is higher than the MAT position 
for at least one image trail. This will be explained in more 
detail below. 

[0073] Referring to Fig. 7E, the image trail status array 
392 stores, for each image trail, a set of buffer location 
information, the MAT value of the last record stored in 
that image trail, a Mellow flag, and the current limit po- 
sition (i.e., the logical end of file). The buffer position 
information for an image trail includes pointers to the 
two buffers used by the image trail, an index indicating 
which of the two buffers is currently being written to, a 
pointer to the current block being written to, and a point- 
er (or offset) to the position within that block at which the 
next image record for the image trail will be written. The 
buffer position information is updated every time an au- 
dit record is added to an image trail buffer. The Mellow 
flag is used in association with the durable storage of 
image trail context records, as is described in more de- 
tail below with reference to Figs. 8C and 8J. The limit 
position indicates the last record in the image trail that 
should be read by any Updater processing the audit 
records in the image trail. 

[0074] The Receiver process also stores in memory 
a "Next Message Sequence Number" 396, a "restart 
MAT position" 398, an "ExpectStopUpdate" flag 399, 
and a Takeover_Mode flag 399A. The Next Message 
Sequence Number 396 is the message sequence 
number the Receiver expects to see in the next message 
buffer and is normally incremented by one after 
each message buffer is received. During normal 
operation, the restart MAT position 398 is set equal to the 
highest MAT value of the audit records in the last message 
buffer that was properly sequenced and successfully 
received from the Extractor. Whenever the Receiver 
process is started or restarted, however, the restart MAT 
position 398 is initially set to the lowest of the MAT 
position values stored in the image trail context records 
271. The ExpectStopUpdate flag 399 is a flag set in 
response to a special "Expect Stop Update" message from 
the Monitor process just prior to a StopUpdaters audit 
record being moved by the Extractor process into its 
current message buffer. 
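
The two rules for the restart MAT position can be written compactly, as in the sketch below. The function names are hypothetical. Choosing the lowest stored position on restart means that no image trail misses audit records; records that other image trails have already stored are re-sent but can be recognized by their MAT positions. 

```python
from typing import Iterable


def initial_restart_mat_position(image_trail_mat_positions: Iterable[int]) -> int:
    """On (re)start: the lowest MAT position in the image trail contexts."""
    return min(image_trail_mat_positions)


def restart_after_message(record_mat_positions: Iterable[int],
                          current: int) -> int:
    """In normal operation: the highest MAT value in the last good buffer.

    An empty buffer leaves the restart position unchanged (an assumption;
    the text does not discuss this case).
    """
    return max(record_mat_positions, default=current)
```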

[0075] The Takeover_Mode flag 399A is set whenev- 
er the backup portion of the RDF system is performing 
an RDF takeover operation. When the Takeover_Mode 
flag is set, the Receiver and Updaters operate differently 
than usual, as will be described in more detail below. 
[0076] Referring to Fig. 7F, an Updater table 400 is 
used by the Receiver to map Updaters to their image 
trails, and also to keep track of which Updaters have 
sent it messages. 

[0077] Referring to Fig. 7G, the Receiver uses a pair 
of "system transaction list" data structures SysTxList 
410 to keep track of the range of transaction IDs for 
active transactions being handled by each processor of the 
transaction management facility 102 (see primary 
system 100, Fig. 1). The transaction management facility 
TMF 102 may include multiple nodes, and each node 
can include up to sixteen processors. Furthermore, 
each processor in the TMF independently assigns 
monotonically increasing sequence numbers to the 
transactions it executes. While the number of parallel 
processors is potentially high, in practice the number of 
nodes in the TMF rarely exceeds four. 
[0078] Two pointers 412, 414 are used to point to the 
current and previous versions of the system transaction 
list 410. The current system transaction list is the one 
that is currently being updated by the Receiver process 
for the current TMP control period, while the previous 
system transaction list indicates the range of transaction 
IDs for active transactions for the previous (i.e., 
complete) TMP control period. The actual values stored in 
the SysTxList slots are just the lowest and highest 
sequence numbers of active transactions for the relevant 
TMP control point period. The Node and Processor 
portions of the transaction IDs are not stored in SysTxList 
because they are indicated by the slot and subslot of the 
SysTxList where those values are stored, along with the 
NumNode entry pointing to the slot. 
[0079] A NumNode array 416 is used to map the node 
numbers of the nodes in the TMF 102 to slots in the 
current SysTxList. Null entries in NumNode are indicated 
by a predefined value, such as -1. Each non-null entry 
of NumNode indicates the slot of the system transaction 
lists 410 to be used for the corresponding node of the 
primary system. For instance, if NumNode(15)=2, that 
indicates that node 15 of the primary system has been 
mapped to slot 2 of the system transaction lists 410 for 
purposes of keeping track of the range of active 
transactions. 

[0080] The NextSlot field 418 of the NumNode array 
indicates the next unused slot of the system transaction 
lists 410. 

[0081] Referring back to Fig. 7A, the Receiver Context 
record 270 includes a copy 391B of the NumNode 
array and a copy 391C of the previous SysTxList. The 
Purger Context record (not shown) includes a flag called 
the Undo List Written flag, the purpose of which will be 
explained below. 

[0082] Referring to Fig. 7H, each transaction ID has 
three components: 

• TxID.Node, which identifies the TMF node on 
which the transaction was executed; 

• TxID.Proc, which in conjunction with TxID.Node 
identifies the TMF processor on which the 
transaction was executed; 

• TxID.Seq#, which is the sequence number 
assigned to the transaction by the TMF processor that 
executed the transaction. 
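
The relationship between transaction IDs, the NumNode array and a system transaction list can be sketched as follows. This is a hypothetical model, not the structure disclosed in the figures: a dictionary stands in for the NumNode array (a missing key plays the role of a null entry), slots are numbered from 0, and the method names are invented for the example. 

```python
from typing import Dict, List, NamedTuple, Optional, Tuple

PROCS_PER_NODE = 16   # each node can include up to sixteen processors


class TxID(NamedTuple):
    node: int   # TxID.Node
    proc: int   # TxID.Proc
    seq: int    # TxID.Seq#


class SysTxList:
    """Lowest and highest active sequence number per (slot, processor)."""

    def __init__(self) -> None:
        self.num_node: Dict[int, int] = {}   # node number -> slot
        self.next_slot = 0                   # NextSlot field
        self.slots: List[List[Optional[Tuple[int, int]]]] = []

    def _slot_for(self, node: int) -> int:
        # Assign the next unused slot the first time a node is seen.
        if node not in self.num_node:
            self.num_node[node] = self.next_slot
            self.slots.append([None] * PROCS_PER_NODE)
            self.next_slot += 1
        return self.num_node[node]

    def note_active(self, tx: TxID) -> None:
        row = self.slots[self._slot_for(tx.node)]
        cur = row[tx.proc]
        if cur is None:
            row[tx.proc] = (tx.seq, tx.seq)
        else:
            row[tx.proc] = (min(cur[0], tx.seq), max(cur[1], tx.seq))

    def range_for(self, node: int, proc: int) -> Optional[Tuple[int, int]]:
        slot = self.num_node.get(node)
        return None if slot is None else self.slots[slot][proc]
```

Only sequence numbers are stored in the slots; the node and processor are implied by where the pair is stored, which matches the space-saving layout described for SysTxList. 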


[0083] Referring to Figs. 8A-8K, the Receiver process 
132 works as follows. 

[0084] Referring to Fig. 8A, the Receiver Startup 
Procedure 440 is called whenever the Receiver process 
132 or its backup is started, as in the case of a failover 
or a transfer of control back to the primary Receiver 
process 132 from the backup Receiver process. The 
Startup procedure begins by performing a "static 
initialization" of the Receiver process (442), which means 
that all static data structures used by the Receiver process 
are allocated and initialized. The startup procedure then 
creates a backup process (444). Then a checkpoint 
operation is performed in which a takeover location is 
transmitted to the backup Receiver process (446). The 
takeover location is, in essence, a program address, and 
in the preferred embodiment the takeover location is the 
program location at which execution of the Receiver 
volatile initialization procedure 450 begins. Finally, the 
Receiver Startup procedure calls (448) the Receiver 
Volatile Initialization procedure 450. 
[0085] Referring to Fig. 8B, the Receiver Volatile 
Initialization procedure 450 is called during startup by the 
Receiver Startup procedure 440. The Receiver Volatile 
Initialization procedure 450 begins by reading the last 
stored Receiver context record and the last stored image 
trail context records from disk and using those context 
records as the Receiver's current context records 
in volatile memory (452). Then the Receiver Volatile 
Initialization procedure allocates and initializes all 
volatile data structures (454) used by the Receiver process, 
including the image trail buffers 274, the image trail status 
array 392, the Updater status array 400, the NumNode 
array 416 and the current and previous system transaction 
lists 410. Then the Receiver Volatile Initialization 
procedure sets the Receiver's Expected Message Sequence 
Number to "1" (456). This will force the Receiver 
and Extractor to resynchronize, unless the Extractor is 
starting up at the same time such as in response to a 
"Start RDF" command. Finally, the Volatile Initialization 
procedure calls (458) the Main Receiver procedure 460. 
[0086] Referring to Figs. 8C-8K, the Main Receiver 
procedure 460 includes a subprocedure 470 for 
periodically flushing image trail buffers to disk and for 
saving the Image Trail context records. This subprocedure 
is called every M seconds, where M is preferably a value 
between 5 and 15 and is typically set to 5. At step 472, 
the image trail context save procedure performs a "lazy" 
flush of image trail buffers to disk. In particular, it checks 
the Mellow flag for each image trail. For each image trail 
having a Mellow flag that is set, the FlushImageTrail 
procedure is called. For each image trail having a Mellow 
flag that is not set, but for which any records have been 
written since the last image trail context save for that 
image trail, the Mellow flag is set. The FlushImageTrail 
procedure is described below with reference to Figs. 8H, 
8I and 8J. 

[0087] It is noted here that the Receiver's context 
record is durably stored on disk whenever the MIT's 
context record is saved by the FlushImageTrail and 
CompleteWriteInProgress procedures (described below with 
reference to Figs. 8H and 8J). 
[0088] Referring to Fig. 8H, the FlushImageTrail 
procedure uses "no-waited writes" to write the contents of 
an image trail buffer to disk. When a no-waited write is 
initiated, the process initiating the write is not blocked. 
Instead it continues with execution of the program(s) it 
is currently executing without waiting for the write to 
complete. However, each time the FlushImageTrail 
procedure is called for a particular image trail, the first 
thing it does is call the CompleteWriteInProgress procedure 
(shown in Fig. 8I) to ensure that any previously initiated 
write for that image trail has completed successfully 
(step 475). Then the FlushImageTrail procedure performs 
a no-waited write on the image trail buffer to disk, 
and resets the image trail's buffer position information 
to reference the beginning of the other buffer 274 for the 
image trail (step 476). Because of the operation of the 
CompleteWriteInProgress procedure, the other buffer 
274 for the image trail is known to be available for use 
when step 476 is executed. 

[0089] If the current image trail file is at or above a 
predefined maximum file size, a new image trail file is 
generated (477). Referring to Fig. 8I, when a new image 
trail file is to be generated, an image trail file sequence 
number is incremented and the new file is generated 
using the sequence number as part of its file name. Then, 
a copy of the previous SysTxList from the receiver 
context record is stored in the top of the new image trail file. 
[0090] Referring to Fig. 8J, the CompleteWriteInProgress 
procedure immediately exits if no write for the 
specified image trail is in progress (step 478-A). Also, if 
a previously initiated write is still in progress, the 
procedure waits until it completes (step 478-B). Also, if a 
previously initiated write has failed, the write operation is 
repeated using a waited write operation until the write 
successfully completes (step 478-C). Next, if the Mellow 
flag of the image trail being processed is set, the Mellow 
flag is cleared, the Image Trail context record is durably 
stored and the LimitPosition for the Image Trail is 
updated (step 478-D). 

[0091] If the image trail being processed is the MIT, 
the NumNode array and previous SysTxList are copied 
into the Receiver context record, and the Receiver con- 
text record is durably stored (478D). 
[0092] Finally, the image trail buffer associated with 
the write operation that has completed is marked as 
available so that it can be used once again by the Re- 
ceiver process (step 478-E). 
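
The interplay between the periodic context save, FlushImageTrail and CompleteWriteInProgress can be illustrated with the toy model below. It is a sketch under several assumptions rather than the disclosed implementation: a no-waited write is modelled by remembering which buffer is "in flight" and copying it to a list when its completion is observed, write failures and new-file creation are not modelled, and all class, field and function names are hypothetical. 

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ImageTrail:
    name: str
    buffers: List[List[str]] = field(default_factory=lambda: [[], []])
    current: int = 0                 # index of the buffer being filled
    mellow: bool = False             # Mellow flag
    dirty: bool = False              # records written since last context save
    pending: Optional[int] = None    # buffer index of a write in flight
    disk: List[str] = field(default_factory=list)   # stand-in for the file
    context_saves: int = 0           # number of durable context saves

    def append(self, record: str) -> None:
        self.buffers[self.current].append(record)
        self.dirty = True


def complete_write_in_progress(trail: ImageTrail) -> None:
    if trail.pending is None:
        return                       # step 478-A: nothing in flight
    # Steps 478-B/C: a real system would wait for or retry the write here.
    buf = trail.buffers[trail.pending]
    trail.disk.extend(buf)
    if trail.mellow:                 # step 478-D
        trail.mellow = False
        trail.context_saves += 1
        trail.dirty = bool(trail.buffers[trail.current])
    buf.clear()                      # step 478-E: buffer is available again
    trail.pending = None


def flush_image_trail(trail: ImageTrail) -> None:
    complete_write_in_progress(trail)    # step 475
    trail.pending = trail.current        # step 476: start a no-waited write
    trail.current ^= 1                   # switch to the other buffer


def context_save(trails: List[ImageTrail]) -> None:
    """Periodic "lazy" flush, called every M seconds (step 472)."""
    for t in trails:
        if t.mellow:
            flush_image_trail(t)
        elif t.dirty:
            t.mellow = True
```

In this model a record reaches the durable context only on the third timer pop after it arrives, but the Receiver never blocks on a data write, which is the trade-off the lazy flush is designed to make. 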
[0093] The Receiver context save and image trail 
flushing procedures shown in Figs. 8C, 8H, 8I and 8J 
are very efficient, enabling the Receiver to manage 
many image trails and save context in a timely manner. 
This can be best appreciated by reviewing the operation 
of these procedures in two exemplary situations. For 
each situation discussed, it is assumed that there are 
three image trail buffers: MIT, IT1 and IT2. 
[0094] Situation A. The context save timer pops and 
the Receiver's context save procedure is called. Be- 
cause the mellow flags for the image trails are not set, 
they are now set and the Receiver immediately resumes 
processing new audit sent by the Extractor. 
[0095] When the context save timer pops again and 
the context save procedure is called, it invokes the 
FlushImageTrail procedure for each image trail because 
the mellow flag is set for each of the image trails. Since 
no writes are currently outstanding to each image trail 
file, the CompleteWriteInProgress procedure returns 
immediately, and no-waited writes are initiated to store the 
current image trail buffer for each image trail to disk. The 
alternate buffer for each trail becomes the new current 
buffer. Because these writes are no-waited, the Receiver 
immediately returns to processing new data from the 
Extractor, storing said image audit in the new current 
buffers. 

[0096] When the Receiver's context save timer pops 
again and the Receiver context save procedure is 
called, the mellow flag is still set for each trail. Therefore 
the FlushImageTrail routine is called for each image 
trail, which in turn calls the CompleteWriteInProgress 
routine for each image trail. Because these writes were 
initiated previously, the Receiver does not actually have 
to wait. Assuming each previously initiated buffer write 
completed without error, the mellow flag is now cleared 
for each image trail and the context records for the image 
trails are written to disk using a waited write operation. 
However, since the context records are small, these 
writes are completed almost immediately. Each image 
trail's context record on disk now reflects all data just 
written. Program control then returns to the Receiver's 
context save procedure and then to the Receiver's main 
procedure, where it resumes processing new data from 
the Extractor. 

[0097] The context save and FlushImageTrail 
procedures almost never wait for disk operations to be 
performed because the image trail buffer write operations 
complete between executions of the context save 
procedures. As a result, the Receiver's processing of data 
from the Extractor is virtually uninterrupted by the image 
trail buffer flushing and context saving operations. This 
remains true even if the Receiver is servicing as many as 
a hundred image trails. 

[0098] Situation B. In this situation, so much audit is 
being sent to the Receiver that an image trail buffer fills 
before the context save timer pops. When a buffer write 
operation is initiated for each image trail, the alternate 
buffer becomes the current buffer. 
[0099] When the context save timer pops, the context 
save procedure is called. Because the mellow flag is not 
currently set, it is now set and the Receiver returns to 
processing new data from the Extractor. This allows 
more records to be stored in the current image trail 
buffer. 

[0100] If the current image trail buffer is filled before 
the next Receiver context save, the FlushImageTrail 
procedure is called. Before starting the write operation, 
the CompleteWriteInProgress procedure is called. 
Because the previous write was no-waited and was issued 
previously, that write will already have completed and 
the Receiver does not have to wait for that write 
operation to complete. The CompleteWriteInProgress 
procedure clears the image trail's mellow flag, and durably 
stores the image trail's context record. Then the 
FlushImageTrail procedure issues a new no-waited write for 
the full image trail buffer, makes the other buffer the new 
current buffer, and returns immediately to processing 
new audit from the Extractor. 
[...] stored Next Message Sequence Number (468). If the 
received message sequence number is not equal to the 
locally stored Next Message Sequence Number, the 
received message buffer is discarded (480) and an Error 
Reply message is sent to the Extractor (482) indicating 
the need to re-establish synchronization. 

[0101] If the received message sequence number is 
in sequence, the locally stored Next Message Sequence 
Number is incremented by one (484) and a "Message 
Buffer OK" reply is sent to the Extractor (484). A mes- 
sage buffer identifier is associated with the received 
message and is also associated with the reply message 
so that the Extractor can properly update its message 
buffer status table by marking the acknowledged mes- 
sage buffer as available. 

[0102] Next, all the audit image records in the re- 
ceived message buffer are processed in sequence 
(490). For each record, the image trail associated with 
the record is determined (by determining the database 
volume updated on the primary system, determining the 
Updater responsible for replicating RDF protected files 
on that volume and then determining the image file as- 
sociated with that Updater) (492). Next, the MAT posi- 
tion (AuditRecord.MATpsn) in the audit record is com- 
pared with the MAT position (ITMATpsn) for the identi- 
fied image trail (494). If the audit record's MATpsn is not 
larger than the image trail's MATpsn, the audit record is 
ignored (496) because it has already been processed 
by the Receiver. Otherwise, the audit record is moved 
into the identified image trail buffer, and the associated 
image trail's current MAT position (ITMATpsn in the im- 
age trail status array 392) is updated to this audit 
record's MAT position (498). 
[0103] If the received record is a "Stop Updaters" 
record, step 492 determines that the record is associat- 
ed with all the image trials. The Stop Updaters record is 
written to the image trail buffer for each image trail 
whose MAT position (i.e., the MAT position of the last 
record written to the image trail) is less than the Stop 
Updaters record's MAT position (AuditRecord.MATpsn). 
Normally, unless there has been a recent Receiver proc- 
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ess failure, the Stop Updaters record will be written to 
every image trail. Next, all the image trails buffers to 
which the Stop Updaters record was written are flushed 
to disk and the corresponding Image Trail context 
records are updated and durably stored to disk. Once 
the Receiver detects that the image trail disk cache 
flushes and context record saves have When the con- 
text save timer pops again and the Receiver's context 
save procedure is called, the mellow flag is set and the 
Receiver returns immediately to processing new audit 
from the Extractor. 

[0104] When the current image trail buffer fills again 
and must be written to disk, the CompleteWriteln- 
Progress procedure is called by the FlushlmageTrail 
procedure. Again, there was a previous write, but it has 
already completed. Therefore the CompleteWritel- 
nProgress procedure clears the mellow flag and up- 
dates and durably stores the image trail's context 
record, which now reflects all audit image records writ- 
ten to disk by the write that just completed. The Fluslm- 
ageTrail procedure issues a new no waited write for the 
full image trail buffer, the buffer whose contents have 
already been written to disk is made the new current 
buffer, and then the Receiver returns immediately to 
processing new audit from the Extractor. 
[01 05] Thus, when under pressure from high amounts 
of audit being sent by the Extractor, the Receiver is able 
to update its context quickly and resume processing au- 
dit image records, only having to wait for the image trail 
context write to complete, but not having to wait at all 
for image trail buffer writes to complete. This is as effec- 
tive for a hundred image trails as it is for one. 
[01 06] The Receiver process 1 32 is a "passive" proc- 
ess in that it does not initiate messages to other proc- 
esses. Rather it only responds to messages from the 
Extractor process 130, messages from the Updater 
processes 134, and from the Monitor process 140. 
[01 07] Referring to Figs. 8D, 8E and 8F, when a mes- 
sage is received from the Extractor process (462), if the 
message is a Resynch request message, the Receiver 
determines which of the MAT positions listed in Image 
Trail context records is lowest (464), and sends a Re- 
synch Reply message to the Extractor with the deter- 
mined lowest MAT position embedded in the reply mes- 
sage (466). 

[0108] If the received Extractor message is a mes- 
sage buffer message, the message sequence number 
(denoted Message. SequenceNumber) in the received 
message is compared with the locally completed, the 
Receiver increments the Receiver.StopUpdatersCnt 
391 count value in its context record and durably stores 
the Receiver context record to disk. By following these 
steps the Receiver ensures (A) that each Stop Updaters 
record is durably stored to all image trails, and (B) that 
the Receiver.StopUpdatersCnt 391 count value is incre- 
mented once, and only once, for each distinct Stop Up- 
daters record. 
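
The message buffer handling described above (a sequence-number check followed by per-record MAT position filtering) can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the `Receiver` class, its field names, the string reply values and the initial "nothing stored yet" position of -1 are all hypothetical, and each image trail buffer is modeled as a plain list.

```python
from dataclasses import dataclass, field


@dataclass
class AuditRecord:
    mat_psn: int   # MAT position of the record (AuditRecord.MATpsn)
    trail: int     # image trail for the record (assumed already resolved, step 492)


@dataclass
class Receiver:
    next_seq: int = 0                               # Next Message Sequence Number
    it_mat_psn: dict = field(default_factory=dict)  # per-trail ITMATpsn
    buffers: dict = field(default_factory=dict)     # per-trail image trail buffer

    def on_message_buffer(self, seq, records):
        # Steps 468/480/482: discard an out-of-sequence buffer and ask for resync.
        if seq != self.next_seq:
            return "ERROR_RESYNC"
        self.next_seq += 1                          # step 484
        for rec in records:                         # step 490
            last = self.it_mat_psn.get(rec.trail, -1)
            if rec.mat_psn <= last:                 # steps 494/496: already processed
                continue
            self.buffers.setdefault(rec.trail, []).append(rec)  # step 498
            self.it_mat_psn[rec.trail] = rec.mat_psn
        return "MESSAGE_BUFFER_OK"
```

Because a record is accepted only when its MAT position exceeds the trail's stored position, a message buffer resent after a resynchronization adds no duplicate records to any trail.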

[0109] If the record is a transaction state record it is
stored in the master image trail (498). Further, if the
record is a transaction state record other than a commit
or abort record, the Receiver also updates the current
SysTxList by calling the Update Current SysTxList
procedure (498), described in more detail below with
respect to Fig. 8K. This procedure updates the current
SysTxList, when necessary, so as to indicate the full
range of transaction IDs for active transactions during
the current TMP control point period.
[0110] If the received record is a TMP Control Point 
record, step 492 determines that the record is associat- 
ed with the MIT, where it is written in step 498. Further, 
if the received record is a TMP Control Point record, the 
Receiver swaps the SysTxList pointers, making the current
SysTxList the previous SysTxList, and making the
previous SysTxList into the current SysTxList. Furthermore,
the current SysTxList is cleared so as to store all
null values (498). 

[0111] If moving an audit image record into an image
trail buffer would overflow a 4K byte block in the image
trail buffer (504), special processing is required (see
description of steps 510, 512 below). Furthermore, if moving
the audit record into the image trail buffer would
overflow the last block in the image trail buffer (506) the
entire image trail buffer through the last 4K block is
durably stored in the associated image trail file (508) by
calling the FlushImageTrail procedure (see Figs. 8H, 8I
and 8J).

[0112] If a 4K byte block has been filled, the procedure
sets up a new 4K block either in the same buffer if there
is room for another 4K block, or in the other buffer for
the image trail if the current buffer has been filled. In
either case, the following information is stored in the block
header for the new block: the position of the block in the
image trail file, the current MIT file position (which is the
MIT file and block header position associated with the
last audit record written to the MIT message buffer), a
pointer to the first record (if any) whose beginning is
located in the 4K block, and the MAT position of the record
located immediately after the block header (see earlier
discussion of Fig. 7D). Then the process of moving the
current audit record into the image trail buffer is
completed (512) and processing of the next audit record (if
any) in the received message buffer begins at step 490.
[0113] If the received message buffer was empty 
(520), the Receiver determines the highest of the MAT 
positions stored in the context records for all the image 
trails, which is equal to the MAT position of the last audit 
record received from the Extractor in the last message 
buffer received that contained any audit records. Then 
an "RDF control record" is moved into all the image trail 
buffers (524). The RDF control record denotes (A) the 
determined highest MAT position, and (B) the times- 
tamp value in the received message buffer's header. 
[0114] If the received message buffer was not empty
(520), but if one or more image trails received no audit
records from the current message buffer (526), the
Receiver determines the highest of the MAT positions
stored in the context record for all the other image trails
(528), which is equal to the MAT position of the last audit
record received from the Extractor in the current
message buffer. Then an "RDF control record" is moved into
each image trail buffer that did not receive any audit
records (530). The RDF control record denotes (A) the
determined highest MAT position, and (B) the timestamp
value in the received message buffer's header.
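
The two RDF control record cases (an empty message buffer, and a non-empty buffer that left some image trails without audit) can be illustrated with one small function. This is a sketch only: the function name, the dictionary representation of the per-trail MAT positions and the tuple form of the control record are assumptions made for the example, and at least one image trail is assumed to exist.

```python
def rdf_control_records(it_mat_psn, trails_with_audit, buffer_timestamp):
    """Return {trail: (highest_mat_psn, timestamp)} for trails needing a control record.

    it_mat_psn        -- dict mapping each image trail to its current MAT position
    trails_with_audit -- set of trails that received audit from this message buffer
    buffer_timestamp  -- timestamp taken from the message buffer's header
    """
    if not trails_with_audit:
        # Empty message buffer (520): every image trail gets a control record.
        idle = set(it_mat_psn)
        highest = max(it_mat_psn.values())
    else:
        # Non-empty buffer (526-530): only trails that received no audit.
        idle = set(it_mat_psn) - set(trails_with_audit)
        highest = max(it_mat_psn[t] for t in trails_with_audit)
    return {t: (highest, buffer_timestamp) for t in idle}
```

For example, with three trails at MAT positions 5, 9 and 7 where only the last two received audit, the first trail alone gets a control record carrying position 9.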
[0115] If the backup system is in Stop Updaters at 
Timestamp mode and the last audit record in the buffer 
had a timestamp greater than or equal to the StopTS, 
then the Receiver performs the following sequence of 
tasks (532). It flushes all image trail buffers to disk and 
durably saves the image trail context records. It copies 
the NumNode array and the previous SysTxList (i.e., for 
the last complete TMP control interval) to the Receiver 
context record and durably stores the Receiver context 
record. Finally, the Receiver will have received a request 
message from the Purger, and the Receiver replies to 
that request with a message that includes the end of file 
positions for all image trails and that enables the Purger 
to generate the Undo List (see Fig. 12). 
[0116] Referring to Fig. 8G, when a limit position message
is received from any Updater process (540), the
Receiver sends a reply message to the requesting Up- 
dater with the LimitPosition location for that Updater's 
image trail (544). 

Updating the SysTxList Table 

[0117] Referring to Figs. 7G and 8K, the Update Current
SysTxList procedure 550 is called by the Receiver
to process each logical audit record and each transaction
state record other than a commit or abort record.
The Receiver passes just the transaction ID (TxID) of
the record being processed to this procedure (551). The
procedure uses the Node field of the received TxID to
look up in the NumNode array the Slot of the SysTxList
assigned to that node (552). If no slot has been assigned
to the node (553-Y), a slot is assigned to it by storing
the NextSlot value in the appropriate entry of the
NumNode array and then incrementing NextSlot (554).
[0118] The low and high sequence numbers for the 
Node and Processor associated with the received trans- 
action ID are read from the SysTxList (555) (i.e., at the 
slot and subslot of SysTxList for that Node and Proces- 
sor) and then compared with the Seq# field of the re- 
ceived transaction ID. If the Seq# field of the received 
transaction ID is less than the low sequence number 
stored in the SysTxList for that Node and Processor, the 
low sequence number field is replaced with the value of 
the Seq# field of the received transaction ID (556). Similarly,
if the Seq# field of the received transaction ID is
higher than the high sequence number stored in the current
SysTxList for that Node and Processor, the high sequence
number field is replaced with the value of the
Seq# field of the received transaction ID (557). If the low
and high sequence numbers stored in the SysTxList are
null, the Seq# field of the received transaction ID is
stored in both (558).
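
The Update Current SysTxList procedure amounts to maintaining a low/high sequence-number range per Node and Processor. A minimal sketch follows; the class and field names are hypothetical, the NumNode array and SysTxList are modeled as dictionaries rather than fixed-size arrays, and a missing dictionary entry stands in for the "null" range.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class TxID:
    node: int   # Node field
    cpu: int    # Processor field
    seq: int    # Seq# field


@dataclass
class SysTxTables:
    num_node: dict = field(default_factory=dict)     # NumNode: node -> slot
    next_slot: int = 0                               # NextSlot
    sys_tx_list: dict = field(default_factory=dict)  # (slot, cpu) -> [low, high]

    def update_current(self, tx):
        # Steps 552-554: find the node's slot, assigning the next free one if needed.
        if tx.node not in self.num_node:
            self.num_node[tx.node] = self.next_slot
            self.next_slot += 1
        key = (self.num_node[tx.node], tx.cpu)
        entry = self.sys_tx_list.get(key)
        if entry is None:
            # Step 558: null range, so the sequence number becomes both bounds.
            self.sys_tx_list[key] = [tx.seq, tx.seq]
            return
        entry[0] = min(entry[0], tx.seq)   # step 556: lower the low bound
        entry[1] = max(entry[1], tx.seq)   # step 557: raise the high bound
```

Only the range bounds are kept, so the table size depends on the number of nodes and processors, not on the number of transactions seen in the control point period.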

[0119] The primary data structures used by each Updater
process 134 in the preferred embodiment are
shown in Fig. 9. Each Updater process durably stores
a context record 570 on a nonvolatile (disk) storage device
on a periodic basis (e.g., once every 2 to 10 minutes,
with 5 minutes being preferred). As shown in Fig.
9 the Updater context record includes:

• a Redo restart position 571, indicating the position
of the record immediately following the last image
trail record processed by the Updater before the last
Updater context save operation during a Redo
pass;

• an Undo restart position 572, indicating the next image
trail record to process during an Undo pass after
the last Updater context save operation;

• a StopUpdaterCompleted flag 573, which is set
when the Updater has stopped operation in response
to reading a Stop Updaters record;

• a StopUpdateToTime Completed flag 574A,
timestamp-based Restart IT position 574A, used to
indicate where to restart processing image trail
records after performing a "Stop Updaters at
Timestamp" operation;

• a Takeover_Completed flag 574B that is set when
the Updater completes processing all the records in
its image trail during an RDF takeover operation;

• a Type of Pass indicator 574C, which indicates
whether the Updater is performing a Redo pass or
an Undo pass (as explained below);

• an End Time Position 574D, which indicates the
record last processed at the end of a Redo pass,
while performing a Stop Updaters at Timestamp
operation; and

• a Start Time Position 574E, which indicates the last
record to be undone during an Undo Pass, and thus
indicates the first record to be processed (for redo)
when the Updater is restarted after completing a
Stop Updaters at Timestamp operation.

[0120] Each Updater also stores in volatile memory:

• a current image trail file position 575;

• a local transaction status table 576;

• a latest RTD (relative time delay) Timestamp value
(577), equal to the last RTD timestamp of any image
audit record processed by the Updater;

• a LimitPosition image trail file position (578);

• a scratch pad (579) for processing audit records;

• a Takeover_Mode flag 579A for indicating if the
RDF system is in takeover mode;

• a Stop Timestamp 579B for indicating the timestamp
limit on transaction updates to be applied by
the Updater;

• a TypeOfPass value 579C indicating whether the
Updater is performing a Redo Pass or Undo pass;
and

• an EndTimePosition 579D and StartTimePosition
579E for marking the End and Start image trail
positions for an Undo Pass.

[0121] The RTD Timestamp value 577 is used by the
Stop Updaters at Timestamp procedure discussed below.
In addition, it is accessible by procedures executed
on behalf of the Monitor process 140 for monitoring how
far the Updaters are running behind the TM/MP 202, and
thus how long it would take the RDF system 220 to catch
up the backup database 124 with the primary database
108 if all transactions on the primary system were to
stop.

[0122] Referring to Figs. 10A-10F, the Updater proc- 
esses 134 work as follows. 

[0123] Referring to Fig. 10A, the Updater Startup
Procedure 600 is called whenever any Updater process 134
is started. The Updater Startup procedure begins by
performing a "static initialization" of the Updater process
(602), which means that all static data structures (such
as a map of primary volumes to backup volumes) used
by the Updater process are allocated and initialized. The
startup procedure then creates a backup process (604).
Then a checkpoint operation is performed in which a
takeover location is transmitted to the backup Updater
process (606). The takeover location is, in essence, a
program address, and in the preferred embodiment the
takeover location is the program location at which
execution of the Updater Volatile Initialization procedure
610 begins. Finally, the Updater Startup procedure calls
(608) the Updater Volatile Initialization procedure 610.
[0124] Referring to Fig. 10B, the Updater Volatile Ini- 
tialization procedure 610 is called during startup by the 
Updater Startup procedure 600. The Updater Volatile In- 
itialization procedure begins by reading the last stored 
Updater context record from disk and using it as the
Updater's current context record in volatile memory (612).
Then the Updater Volatile Initialization procedure allo- 
cates and initializes all volatile data structures (614) 
used by the Updater process, including the scratchpad 
579. Then the Updater Volatile Initialization sends a Lim- 
itPosition request message to the Receiver, and stores 
the LimitPosition value in the resulting reply message in 
its local LimitPosition register 578 (616). 
[0125] If the StopUpdateToTime Completed flag in the
Updater context record is set (617), that flag is reset,
the Redo restart position is set to the Start Time Position
in the Updater context record, and disk errors are
suppressed until the Updater reaches the End Time Position
in the Updater's image trail (618).
[0126] Finally, the Volatile Initialization procedure 
calls (619) the main Updater procedure 620. 

Updater Redo Pass 

[0127] The Updaters have two types of operations: a
redo pass and an undo pass. The redo pass is the normal
mode of operation, in which update and backout audit
is applied to a backup volume. The undo pass is used
for removing all database changes caused by incomplete
transactions. The redo pass (i.e., normal operation)
will be explained first.

[0128] While redoing database update and backout
operations on a backup volume, the Main Updater
procedure 620 executes transactions that are distinct from
the transactions performed by the primary system. In
particular, the Updater treats all the operations it performs
during a set period of time as a transaction. A timer
is set at the beginning of each Updater transaction.
When the timer expires or "pops," the Updater
transaction commits, which causes all the updates to the
backup volume made during the transaction to be made
permanent.

[0129] The database changes made by the Updaters
to the backup database are performed in the same TMF
transaction processing environment as transactions on
the primary system. Thus, all database changes made
by an Updater are reflected in audit records generated
by the TMF system on the backup database. Also,
whenever an Updater performs a redo on an audit
record, it replaces the transaction ID in the audit record
with the transaction ID for the Updater transaction so
that the disk process that performs the database update
can apply conventional commit/abort logic to the
Updater transactions.

[0130] Referring to Figs. 10C-10F, the Main Updater
procedure 620 includes a subprocedure 630 for saving
the Updater's context record. This subprocedure is
called (A) whenever a predefined amount of time has
elapsed since the current Updater transaction was started,
and (B) at various other times such as when operation
of the Updater is being stopped. In a preferred
embodiment, the Updater transaction timer expires or pops
once every five minutes.
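
A minimal sketch of this context-save subprocedure is given here, covering the commit, restart-position update and durable store that make up its core steps (632-634). The `UpdaterContext` class and the `commit` and `durable_store` callables are hypothetical stand-ins: the first represents committing the current Updater transaction, the second a blocking ("wait until complete") write of the context record.

```python
from dataclasses import dataclass, asdict


@dataclass
class UpdaterContext:
    redo_restart: int = 0        # Redo restart position
    undo_restart: int = 0        # Undo restart position
    type_of_pass: str = "redo"   # "redo" or "undo"


def save_context(ctx, current_position, commit, durable_store):
    """Commit the open Updater transaction, then durably store the context record."""
    commit()                               # step 632: make the backup changes permanent
    if ctx.type_of_pass == "redo":         # step 633: record where to restart
        ctx.redo_restart = current_position
    else:
        ctx.undo_restart = current_position
    durable_store(asdict(ctx))             # step 634: must not return before the write lands
```

The ordering matters: the restart position is stored only after the commit, so a restart never skips image trail records whose database changes were not made permanent.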

[0131] The first step (632) of the Updater Context
Save procedure 630 is to commit the current Updater
transaction. This makes permanent all the database
changes to the backup database made during the current
Updater transaction. Then the procedure stores the
current image trail position in the Redo Restart Position
in the Updater's context record if the Updater is performing
a Redo Pass. If the Updater is performing an Undo
Pass, it stores the current image trail position in the Undo
Restart Position in the Updater's context record (633).
The Updater context record 570 is then durably stored
to disk (634) using a "wait until complete" write operation
so as to ensure that the context record is durably saved
before any further operations can be performed by the
Updater. Also, the Updater sends a purge request to the
Purger process if at least a predefined amount of time
(e.g., 5 minutes) has elapsed since the last such request
was sent (635). The purge request includes the SysTxList
from the image trail file currently being processed
by the Updater, and requests the Purger to purge image
trail files no longer needed by the Updater.

[0132] Referring to Figs. 10D and 10E, the primary job
of the Main Updater procedure 620 is to process audit
image records in its image trail. When the Main Updater
procedure first starts performing a Redo operation, it
resets and starts the Updater transaction timer, and
starts a new Updater transaction (621). The transaction
ID for the Updater transaction is checkpointed to the
backup Updater process. If the primary Updater fails,
the backup Updater uses the transaction ID to determine
when the disk process has finished aborting the
Updater transaction that was last being performed by
the primary Updater process. Then, when the disk process
finishes backing out all database changes made by
that Updater transaction, the backup Updater process
resumes processing audit records in the image trail at
the Redo restart position (or the Undo restart position in
the event of an Undo Pass).

[0133] At step 622 it reads the next audit image
record, if any, in the image trail, and updates its locally
stored "latest RTD Timestamp" 577 value with the RTD
timestamp from the audit image record. If the Stop
Timestamp value 579B is not zero, indicating that the
Updater is performing a Stop Updaters at Timestamp
operation, and the RTD timestamp in the audit image
record is equal to or greater than the Stop Timestamp,
the procedure performs a set of steps
(625) for stopping the Updater. In particular:

• the End Time Position is set to the current image
trail position;

• the TypeOfPass field (579C, Fig. 9) in the Updater's
context record is set to Undo;

• the Updater's context is saved (see Fig. 10C); and

• the Updater performs the Updater Undo Pass
(described below with reference to Figs. 14A and 14B).

[0134] If the Stop Timestamp value is zero or the current
record's RTD timestamp is less than the Stop
Timestamp (623-N), then the Main Updater procedure
continues with normal processing of the image trail
record read at step 622.

[0135] If the audit record just read is an "RDF Control"
record, no further processing of that record is required
and processing resumes with the next audit record (622).

[0136] If the audit record just read is a "Stop Updaters"
record, the StopUpdaterCompleted flag 574 in the
Updater context record 570 is set to True (640) and the
Updater context save procedure 620 is called (642). The
StopUpdaterCompleted flag 574 is read by the Monitor
process on the next Start RDF or Start Update operation
to ensure that all Updaters have stopped and that all
have processed their image trails through the Stop
Updaters record (as opposed to stopping due to a failure).
Then the Updater's backup process is terminated and
the Updater process itself terminates (644). The Updater
process will start up again after the operator of the
RDF system performs on the remote backup system the
DDL operation that created the Stop Updaters audit
record and then enters either the "Start Update" or
"Takeover" command.

[0137] If the audit record just read is an update or
backout record, a redo of the update or backout operation
noted in the audit record is initiated against the
backup database file (646).

[0138] When the attempt to read a next audit record
(622) encounters an audit record at or beyond the
LimitPosition value in LimitPosition register 578, a
LimitPosition request message is sent to the Receiver (660) to
determine whether the LimitPosition for the Updater has
been advanced. When a reply message is received, the
LimitPosition value in the received message is compared
with the locally stored LimitPosition value (662).
If the received LimitPosition is not larger than the previously
stored LimitPosition value, the Updater 134 cannot
process any further audit image records. As a result,
the Updater waits for W seconds (664), where W is
preferably a value between 1 and 10 and is typically set to
10, and then sends another LimitPosition request message
to the Receiver (660). This continues until a
LimitPosition value larger than the current LimitPosition is
received from the Receiver. At that point the locally
stored LimitPosition value in LimitPosition register 578
is replaced with the LimitPosition value in the received
reply message (666), and then processing of audit image
records resumes at step 622.

RDF Takeover Procedure

[0139] Referring to Fig. 11, the RDF Takeover procedure
begins when an operator of the system sets the
takeover mode flags (399A, Fig. 7A, and 579A, Fig. 9)
in the Receiver, Updater and Purger processes. The
RDF Takeover procedure is prevented from being executed
if the primary system is still operating. Thus, when
the RDF Takeover procedure is executed, there are no
longer any message buffers of audit image records being
received by the Receiver from the Extractor.
[0140] In response to the Takeover notification, the 
Receiver completes all processing of previously re- 
ceived message buffers (720), flushes all the image trail 
buffers to disk, updating the limit positions for all the image
trails accordingly (722), and durably stores the Receiver
and Image Trail context records to disk (724).
[0141] The Purger responds to the Takeover notification
by sending the Receiver a request for permission
to generate an Undo List file. Similarly, when the Updaters
finish processing all audit records in their respective
image trails, they respond to the Takeover notification
by sending the Purger a request for permission to
perform an Undo Pass.

[0142] After completing steps 720, 722, 724 the Receiver
replies to the Purger request, enabling it to generate
the Undo List (725). The Purger generates the Undo
List file (726), and then grants permission to the Updaters
to perform an Undo Pass (727). The Updaters
respond by reversing the effects of all update audit for
the transactions listed in the Undo List. Upon completing
the Undo Pass, each Updater sets its
Takeover_Completed flag (574C, Fig. 9), durably stores
its context record and terminates (728). The Updater
Undo Pass is discussed in more detail below with reference
to Figs. 14A and 14B.

[0143] When all the Updaters have shut down, the Receiver
reads the Updater context records. If every Updater
has set its Takeover_Completed flag, then the Receiver
sets its Takeover_Completed flag (391A, Fig.
7A), durably stores its context record, and generates an
"RDF Takeover completed" log message (stored in the
log file for the RDF system) indicating the MAT position
of the last audit image record stored to disk (730).
[0144] However, if any of the Updaters fail before setting
their Takeover_Completed flag, the Receiver will
detect this and will generate a corresponding RDF take- 
over failure log message, in which case the system's 
operator will need to re-execute the RDF takeover com- 
mand. 

Generating the Undo List 

[0145] It is noted here that the Undo List is generated 
not only during a takeover operation, but also when a 
Stop Updaters at Timestamp operation is performed.
[0146] Referring to Fig. 12, the Undo List can be created
by either the Receiver or by another process. For
the purposes of this explanation, it will be assumed
that the Undo List is generated by a process herein
called the Purger. However, in other embodiments the
Undo List could be generated by the Receiver or another
process. Further, in some embodiments, one process,
such as the Receiver, could generate the Undo List during
a takeover operation, while another process such as
the Purger could generate the Undo List during a Stop
Updaters at Timestamp operation.
[0147] In the preferred embodiment, the Purger re- 
quests permission from the Receiver to generate the 
Undo List, and then waits until the Receiver grants that 
permission (740). The Receiver grants the Purger per- 
mission only after it is sure that all information needed 
by the Purger has been durably stored. 
[0148] The Purger begins by creating a transaction
status table (TST) that is accessed using a hash table
(750). Referring to Fig. 13, the TST 742 stores, for each
transaction for which information is stored in the table,
the transaction ID 744, and the final state 746 of the
transaction, if it is known. A hash table 748 is used to
locate items in the TST 742. In particular, the transaction
identifier (TxID) of a transaction is converted into a hash
table index by a hash function 749, and then an item in
the hash table either at the index position or after the
index position contains a pointer to the TST entry for
that transaction. The TST 742 is preferably filled with
entries in sequential order, starting either at the top or
bottom of the TST.
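
A lookup that lands "either at the index position or after the index position" reads like open addressing with linear probing, and the sketch below takes that interpretation. It is an assumption-laden illustration: the class and method names are hypothetical, Python's built-in `hash()` stands in for hash function 749, probing wraps around at the end of the table, and resizing is left out.

```python
class TransactionStatusTable:
    def __init__(self, capacity=64):
        self.entries = []                # TST 742: [tx_id, final_state], filled sequentially
        self.slots = [None] * capacity   # hash table 748: indexes into self.entries

    def _probe(self, tx_id):
        # Start at the hashed index and walk forward until tx_id or a free slot is found.
        i = hash(tx_id) % len(self.slots)
        while self.slots[i] is not None and self.entries[self.slots[i]][0] != tx_id:
            i = (i + 1) % len(self.slots)
        return i

    def lookup(self, tx_id):
        slot = self.slots[self._probe(tx_id)]
        return None if slot is None else self.entries[slot][1]

    def insert_if_absent(self, tx_id, state):
        i = self._probe(tx_id)
        if self.slots[i] is not None:
            return False                 # already present: keep the stored state
        if len(self.entries) + 1 >= len(self.slots):
            raise OverflowError("hash table full; resizing is omitted from this sketch")
        self.entries.append([tx_id, state])
        self.slots[i] = len(self.entries) - 1
        return True
```

Keeping the entries in a sequentially filled array while the hash table holds only indexes means the table can later be compacted and the hash table rebuilt without touching the entry format.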



Traverse MIT Backwards and Fill In Transaction Status 
Table 

[0149] Next, the Purger traverses the Master Image Trail
(MIT) backwards (751). If a takeover operation is being
performed, the Purger reads the MIT backwards from
its end of file. If a Stop Updaters at Timestamp operation
is being performed, the Purger finds the starting point
for the backward pass by reading the MIT backwards
until it reads an audit record whose timestamp is less
than the Stop Timestamp. Then it starts the backward
pass at that record.

[0150] In either case, the Purger continues reading 
the MIT backwards until it has read backward through 
one complete TMP control interval. Generally, this 
means that it reads backwards until it has read two TMP 
control point records. The MAT position of the TMP con- 
trol point record that completes the backward pass is 
stored as the "EndMAT" position. 
[0151] For each transaction state record in the MIT 
that is read during this backward pass, the transaction 
state is stored in the transaction table as the final state 
for that transaction only if no information about the
transaction has been previously stored in the transaction
table (751). In other words, only the last transaction state
value in the MIT is stored in the transaction table. Also, 
if the last known state for a transaction is not commit or 
abort, it is denoted as "unknown" in the table. Since the 
state of every active transaction must be represented 
by a transaction state record during each TMP control 
interval, except for transactions that started during that 
TMP control interval, the backward pass will identify all 
transactions whose state is known at the point in time 
in the primary system represented by the last of the audit 
records received by the backup system. 
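
The backward pass over the MIT can be sketched as below. The record layout (tuples tagged "state" or "tmp_cp"), the function name and the use of a dictionary for the transaction status table are assumptions made for the example; the stopping rule follows the general case in the text, where the pass ends at the second TMP control point record read.

```python
def final_states_from_mit(mit_records, start_index):
    """Scan the MIT backwards from start_index through one complete TMP control interval.

    mit_records is a list of tuples, oldest first:
      ("state", mat_psn, tx_id, state)  -- transaction state record
      ("tmp_cp", mat_psn)               -- TMP control point record
    Returns (status_table, end_mat).
    """
    status = {}
    control_points = 0
    end_mat = None
    for rec in reversed(mit_records[: start_index + 1]):
        if rec[0] == "tmp_cp":
            control_points += 1
            if control_points == 2:      # one full control interval has now been read
                end_mat = rec[1]         # kept as the "EndMAT" position
                break
        elif rec[0] == "state":
            _, _, tx_id, state = rec
            if tx_id not in status:      # only the newest state record counts
                status[tx_id] = state if state in ("commit", "abort") else "unknown"
    return status, end_mat
```

Because the scan runs newest to oldest, the first state record seen for a transaction is its last state in the MIT, which is why later (older) records for the same transaction are skipped.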
[0152] Next, the Purger traverses each of the other 
Image Trails backwards from its end until it reaches a 
record whose MAT is less than the EndMAT position. 
For each image trail record, it is determined if the cor- 
responding transaction is represented in the transaction 
status table. If so, nothing further is done for that record. 
Otherwise, a new entry is made in the transaction status 
table, and the status of the corresponding transaction is 
denoted in the table as "unknown" (752). When all the 
image trail files have been processed in this way, the 
transaction status table will contain entries for all trans- 
actions for which (A) there is at least one audit record 
in the image trail files and (B) the outcome of the trans- 
action (commit or abort) is unknown. The Purger next 
constructs a compact list of all the transactions in the 
transaction status table whose status is denoted as "un- 
known" (754). This is preferably done by storing these 
entries at the top of the transaction status table, and the 
resulting table of transactions is herein called the "com- 
pressed transaction status table." The hash table for the 
transaction status table is rebuilt to include only entries 
for transactions whose status is denoted as unknown 
and to point to the remaining transactions in their new 
locations.
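
The table construction in paragraphs [0151] and [0152] can be illustrated with a minimal Python sketch. The record layout (pairs of transaction ID and state), the state name "active", and the function names are assumptions made for this example; the EndMAT bound on the backward scan is left out for brevity.

```python
def build_status_table(mit_states_newest_first, image_trail_tx_ids):
    """Return {tx_id: final_state} for a backward pass.

    mit_states_newest_first: (tx_id, state) pairs read backwards from
    the end of the MIT (hypothetical layout).
    image_trail_tx_ids: transaction IDs met while reading the other
    image trails backwards.
    """
    table = {}
    for tx_id, state in mit_states_newest_first:
        # Reading backwards, the first state seen is the last known one.
        if tx_id not in table:
            table[tx_id] = state if state in ("commit", "abort") else "unknown"
    for tx_id in image_trail_tx_ids:
        # Audit exists for the transaction but no state record was found.
        table.setdefault(tx_id, "unknown")
    return table


def compress(table):
    """Compact list of the transactions whose outcome is unknown."""
    return [tx_id for tx_id, state in table.items() if state == "unknown"]


mit = [(7, "commit"), (5, "active"), (7, "active"), (3, "abort")]
table = build_status_table(mit, [5, 9, 3])
unknown = compress(table)
```

Here transaction 7 keeps its newest state (commit), while transactions 5 and 9 end up in the compressed list.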

[0153] Next, the Purger determines the LowWaterMIT 
position (756). To do this, the Purger reads the MIT 
backwards until it finds a TMP control interval in which
there are no transaction state records for any transac- 
tions in the compressed transaction status table. The 
LowWaterMIT is set to the MIT position of the TMP con- 
trol point record at the beginning of the first TMP control 
interval found that meets this requirement. Alternatively, 
the MAT position for this TMP control point record could 
be retained. 
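
A hedged sketch of the LowWaterMIT search follows. It assumes the MIT has already been grouped into TMP control intervals, each represented as a pair of the control point record's MIT position and the set of transaction IDs that have state records in that interval. That representation, and returning None when no interval qualifies, are choices made for this example only.

```python
def find_low_water_mit(intervals_oldest_first, unknown_tx_ids):
    """Scan control intervals backwards for the first one that holds no
    state record for any transaction in the compressed status table."""
    unknown = set(unknown_tx_ids)
    for control_point_pos, tx_ids in reversed(intervals_oldest_first):
        if unknown.isdisjoint(tx_ids):
            # MIT position of the control point record that begins
            # the qualifying interval.
            return control_point_pos
    return None  # assumption: no qualifying interval was found


intervals = [(100, {1, 2}), (200, {2, 5}), (300, {5, 9}), (400, {9})]
low_water = find_low_water_mit(intervals, [5, 9])
```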

[0154] The Purger generates an Undo file, herein 
called the Undo List or the Undo List file (758), that in- 
cludes: 

• the LowWaterMIT; 

• a parameter indicating the number of transactions
included in the Undo List (which is the same as the 
number of transactions denoted in the compressed 
transaction status table); and 

• a list of the transaction IDs of all the transactions in 
the compressed status table. 

[0155] In one embodiment, the Undo List may be 
stored as a set of blocks, each of which contains up to 
N (e.g., 510) transaction IDs, as well as the LowWater- 
MIT and an indicator of the number of transaction IDs 
listed in that block. 
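
The block layout of paragraph [0155] might be modelled as below. The class and field names are hypothetical, and emitting a single empty block for an empty Undo List (so that the LowWaterMIT is still recorded) is an assumption of this sketch rather than something the description states.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class UndoBlock:
    low_water_mit: int   # repeated in every block
    count: int           # number of transaction IDs in this block
    tx_ids: List[int]


def pack_undo_list(low_water_mit, tx_ids, n=510):
    """Split the Undo List into blocks of at most n transaction IDs."""
    tx_ids = list(tx_ids)
    blocks = []
    # max(..., 1) yields one empty block when there are no transactions.
    for start in range(0, max(len(tx_ids), 1), n):
        chunk = tx_ids[start:start + n]
        blocks.append(UndoBlock(low_water_mit, len(chunk), chunk))
    return blocks


blocks = pack_undo_list(42, range(1200))
```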

[0156] When the Purger has finished generating the 
Undo List, it sets an UndoListWritten flag in its context
record to True, and durably stores its context record 
(760). Also, it responds to pending requests from the 
Updaters to grant them permission to perform an Undo 
Pass. 

[0157] If the Purger fails and restarts in Takeover 
mode or Stop Updaters at Timestamp mode, and the 
UndoListWritten flag is set to False, it purges the Undo 
List file (if one exists) and starts building the Undo List 
from scratch. 

Updater Undo Pass 

[0158] Referring to Figs. 14A and 14B, in takeover 
mode, after each Updater finishes its Redo Pass, it re- 
quests permission from the Purger to perform an Undo 
Pass (770). The Purger responds to that request only 
after it completes generation of the Undo List. 
[0159] Upon receiving such permission, the Updater 
checks to see if the Undo List is empty (772). If so, it 
stops and ends the Undo Pass. Otherwise, it stores 
(774) all entries in the Undo List in a local transaction 
status table (local TST), which may have essentially the 
same structure as the transaction status table shown in 
Fig. 13, except that the Final State column is not needed 
because all transactions listed in the table are assumed 
to be transactions whose final state is unknown. 
[0160] Next, the Updater undoes all updates associated
with incomplete transactions (776). This will be described
in more detail below with reference to Fig. 14B.
Next, if the backup system is in takeover mode, the Updater
sets its Takeover_Completed flag (777). If the
backup system is in Stop Updaters at Timestamp mode,
the Updater sets the TypeOfPass context record field to
Redo, sets the StopUpdateToTime Completed flag to
True, and sets the StartTimePosition field to point to the
last image trail record processed by the Undo Pass
(778). Then the Updater durably stores its context
record (779), and exits by terminating the Updater
process and the backup Updater process (779).
[0161] Referring to Fig. 14B, the Undo Pass starts at 
step 780, with the Updater starting a transaction timer 
(e.g., a 5 minute timer) and starting a new Updater trans- 
action. Then the Updater reads its image audit trail file 
backwards (781), starting with the last record the Up- 
dater applied to the backup database, until it reads a 
block header (782) having a MIT indicated in its header 
that is less than or equal to the LowWaterMIT (783). 
[0162] All complete records in the block having this 
MIT are processed, but no earlier blocks in the image 
trail are processed. For each audit record representing 
an update, the Updater checks the local TST (784). If 
the transaction ID for the transaction is not present in 
the local TST, the audit record is not further processed 
(784-No). On the other hand, if the transaction ID for the 
transaction is present in the local TST, the update rep- 
resented by the audit record is undone (785), and a cor- 
responding exception record is written to an exceptions 
log. As many undo operations as can be performed dur- 
ing each transaction timer period are performed as a sin- 
gle Updater transaction. When the transaction timer 
pops (786), the current Updater transaction is commit- 
ted (787). In addition, the Updater saves its current im- 
age trail position in the Undo position field of its context 
record and durably saves its context record (787). 
[0163] When the Updater finishes processing all the 
complete records in an image trail file block whose 
header indicates a MIT that is less than or equal to the 
LowWaterMIT (783), the current Updater transaction is 
committed and the Undo pass ends (788). 
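
The control flow of the Undo Pass in paragraphs [0161] to [0163] can be sketched as follows. This is an illustration under stated assumptions: blocks are given newest first as pairs of header MIT and a list of (transaction ID, update) records, and the transaction timer is replaced by a hypothetical per-transaction operation budget (ops_per_tx) so that the example is deterministic.

```python
def undo_pass(blocks_newest_first, low_water_mit, local_tst, ops_per_tx=2):
    """Return (undone_updates, commit_count) for one Updater."""
    undone, commits, pending = [], 0, 0
    for header_mit, records in blocks_newest_first:
        # Records inside a block are also read backwards.
        for tx_id, update in reversed(records):
            if tx_id not in local_tst:
                continue  # 784-No: record is not further processed
            undone.append(update)  # 785: undo the update
            pending += 1
            if pending == ops_per_tx:  # stand-in for the timer pop (786)
                commits += 1           # 787: commit the Updater transaction
                pending = 0
        if header_mit <= low_water_mit:
            break  # 783: this block was the last one to process
    commits += 1  # 788: final commit when the Undo Pass ends
    return undone, commits


blocks = [
    (300, [(5, "u5"), (7, "u6")]),
    (200, [(9, "u3"), (5, "u4")]),
    (100, [(5, "u1"), (9, "u2")]),
    (50, [(5, "u0")]),
]
undone, commits = undo_pass(blocks, low_water_mit=100, local_tst={5, 9})
```

The block whose header MIT equals the LowWaterMIT is still processed in full, and the older block (MIT 50) is never read.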

Detailed Explanation of Stop Updaters at Timestamp 
Procedure 

[0164] If the primary system is in active use and the 
Updaters are active on a backup system, the data vol- 
umes on the backup system will not be in a completely 
consistent state because some transactions will be only 
partially stored to disk. Because the Updaters operate 
asynchronously with respect to each other, some Up- 
daters may have already applied audit associated with 
some transactions, while other Updaters have not yet 
processed audit associated with that same set of trans- 
actions. While this "inconsistent state" problem is of no 
consequence for most casual database inquiries (e.g., 
a "browsing" or "read only" query about the number of 
seats available on a particular airplane flight), it is intolerable
for tasks that require consistent or stable access,
such as generating monthly reports and other important 
management data summaries that must be totally inter- 
nally consistent. 

[0165] The "Stop Updaters at Timestamp" feature of 
the present invention brings the backup database to a 
consistent state without affecting the primary system's 
operation in any way. Referring to steps 623 and 625 of
Fig. 10D, when the Stop Updaters at Timestamp feature 
is in use, the Updater automatically stops applying audit 
when it reaches an audit record whose timestamp is 
greater than or equal to the stop timestamp (StopTS). It 
also sets the EndTimePosition field in its context record 
to the current position in the image trail (i.e., the first 
record in the image trail not applied to the backup data- 
base), and sets the TypeOfPass field to Undo. Then it 
saves its context, which marks the end of the Redo 
Pass, and then performs an Undo Pass. 
[0166] The Purger, while generating the Undo List, 
operates slightly differently in Stop Updaters at Times- 
tamp mode than in takeover mode. In particular, while 
traversing the MIT backwards (step 751, Fig. 12), each
transaction whose final state record has a timestamp 
that is greater than (i.e., later than) the StopTS is as- 
signed a final state of "unknown" in the transaction sta- 
tus table. This is done because as of the StopTS time, 
the final status of these transactions is unknown. 
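
The rule in paragraph [0166] amounts to a small classification step, sketched below. The function name, the numeric timestamps, and the convention that stop_ts is None in takeover mode are assumptions for illustration.

```python
def final_state_for_table(state, state_timestamp, stop_ts=None):
    """Final state to record in the transaction status table."""
    if stop_ts is not None and state_timestamp > stop_ts:
        # The outcome was decided after StopTS, so as of StopTS
        # the transaction's final status is unknown.
        return "unknown"
    return state if state in ("commit", "abort") else "unknown"


late_commit = final_state_for_table("commit", 105, stop_ts=100)
```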
[0167] The Undo Pass of the Updaters has already 
been described in detail, above, with reference to Figs. 
14A and 14B. When an Undo Pass is performed in Stop
Updaters at Timestamp mode, instead of takeover 
mode, the backwards pass starts at the last audit record 
applied to the backup database (which is the record be- 
fore the EndTimePosition) instead of starting at the end 
of the image trail. Also, in Stop Updaters at Timestamp 
mode, at the end of the Undo Pass the Updater sets the 
TypeOfPass and EndTimePosition fields to prepare the 
Updater for starting at the proper position and in the 
proper mode when the system administrator restarts the 
Updaters. 

[0168] More specifically, the first time the Updater 
reads an audit image record having an RTD Timestamp 
at or after the Stop Timestamp, the following set of actions
is performed:

• the EndTimePosition in the Updater's context
record is set to the first unprocessed audit record in the
image trail; and

• the TypeOfPass field in the Updater's context
record is set to Undo.
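
The two actions listed above can be sketched as the end of a Redo loop. The context record is modelled as a plain dictionary and image trail positions as list indices; both are assumptions of this example, not the actual record formats.

```python
def redo_until(records, stop_ts, context):
    """Apply (timestamp, update) records until the stop timestamp."""
    applied = []
    for position, (timestamp, update) in enumerate(records):
        if timestamp >= stop_ts:
            # First record at or after the Stop Timestamp: it is the
            # first record NOT applied to the backup database.
            context["EndTimePosition"] = position
            context["TypeOfPass"] = "Undo"
            break
        applied.append(update)
    return applied


context = {"TypeOfPass": "Redo"}
applied = redo_until([(10, "a"), (20, "b"), (30, "c"), (25, "d")], 30, context)
```

Note that the record with timestamp 25 is left unapplied even though it precedes the stop timestamp, because the Updater stops at the first record that reaches it.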

[0169] Note that the Stop Updaters at Timestamp procedure
only causes the Updaters to stop, while the Extractor
and Receiver continue to process audit information.
Also, the Updater saves information in its context
record to guarantee that no image audit records will be
missed by the Updaters as the result of a Stop Updaters
at Timestamp operation. The Stop Updaters at Timestamp
operation leaves the Updaters ready to start
processing all audit image records not applied to the
backup database before the Updaters shut down.

Receiver and Updater Restart

[0170] Whenever the Receiver process is restarted, 
such as after a system failure, an RDF shutdown or an
RDF Takeover, the Receiver process is initialized. After 
a full RDF shutdown the Receiver process is always re- 
started before the Updaters are restarted. 
[0171] Each time the Updaters in a backup system are
restarted, such as after performing a Stop RDF, Stop
Updaters at Timestamp, or an RDF Takeover, each Updater
process is initialized and starts processing image
trail records at the record indicated by the Redo Restart
Position (571, Fig. 9) in the Updater's context record
(790).

[0172] If the StopUpdateToTime Completed flag is 
set, then the Updater suppresses the generation of error 
messages associated with redoing updates that may 
have already been applied to the backup database until 
it reads an audit record whose MAT position is greater 
than the EndTimePosition, at which point the generation 
of such errors is enabled. 

Purging Image Trail Files 

[0173] Generally, an image trail file can be purged (i.e.,
permanently deleted) when it is absolutely certain
that the file contains no audit records that will ever be
needed again, even if there is a primary system failure,
backup system failure, or both. More specifically, an image
trail file must not be purged if it contains an audit record
for any transaction that would not be completely committed
to storage on the backup system in the event of
system failure.

[0174] The purpose of the System Transaction List
(SysTxList), shown in Fig. 7G and stored in the header
of each image trail file, is to facilitate the process of determining
which image trail files can be purged. This will
now be explained in more detail.
[0175] Periodically, such as every 5 minutes, each
Updater sends the Purger a "purge request" message
that includes a copy of the SysTxList at the beginning of the
image trail file that it is currently processing (633, Fig. 10C).
[0176] Referring to Fig. 15, the image trail purging 
procedure (800) is activated periodically (802), such as 
once every hour. Alternately, the time between successive
activations of the Purger may depend on the
amount of audit information being received from the primary
system. For instance, the Purger might be activated
upon the earlier of (A) passage of N minutes, and (B) receipt
of M message buffers from the primary system.
[0177] The Purger reads the last SysTxList sent by
each Updater (804), selects the oldest one of those SysTxLists
by determining which has the lowest transaction
IDs, and uses the selected SysTxList as a Global SysTxList
(806). The Global SysTxList may have the same
structure as the SysTxList shown in Fig. 7G, or the high
transaction ID may be omitted, because the high
transaction ID for each CPU is not used.
[0178] Next, the Purger reads each image trail, starting
with the oldest file (810), and purges all the files in
that image trail that meet predefined purge criteria (808).
The Purger is preferably configured to leave a predefined
minimum number (RetainCount) of files in each image
trail. In other words, even if an image trail file would
otherwise be eligible for purging, it is kept if there are
RetainCount or fewer files left in the image trail (812).
In a preferred embodiment, the RetainCount cannot be
set to less than two.

[0179] The Purger determines whether, for all CPUs,
the high transaction ID value in the SysTxList in the image
trail file header is less than the low transaction ID
value in the Global SysTxList (814). If so, and the image trail
contains at least the RetainCount number of files, the
file is purged (816), and then the next file in the image
trail is processed (818). If for any CPU the high transaction
ID value in the SysTxList in the image trail file
header is higher than or equal to the corresponding low
transaction ID in the Global SysTxList, the image trail
file may contain audit records that would be needed in
a Takeover or Stop Updaters at Timestamp operation,
and therefore that image trail file cannot be deleted and
processing of that image trail is complete.
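
The purge test of paragraphs [0177] to [0179] can be illustrated with the following Python sketch. It is a simplified model, not the actual Purger: each SysTxList is represented as a dictionary mapping a CPU number to a (low, high) pair of transaction IDs, "oldest" is taken to mean the smallest tuple of low IDs in CPU order, all lists are assumed to cover the same CPUs, and the RetainCount rule follows the wording of step 812 (a file is kept when RetainCount or fewer files remain).

```python
def select_global_systxlist(last_lists):
    """Pick the oldest SysTxList among those last sent by the Updaters."""
    return min(last_lists, key=lambda l: [l[cpu][0] for cpu in sorted(l)])


def purge_image_trail(files_oldest_first, global_list, retain_count=2):
    """Return (purged_names, kept_names) for one image trail.

    files_oldest_first: list of (file_name, header_systxlist).
    """
    files = list(files_oldest_first)
    purged = []
    # 812: never go below RetainCount files in the image trail.
    while len(files) > retain_count:
        name, header = files[0]
        # 814: for every CPU, the file's high ID must be lower than
        # the Global SysTxList's low ID.
        if not all(header[cpu][1] < global_list[cpu][0] for cpu in header):
            break  # file may still be needed; this trail is done
        purged.append(name)  # 816: purge the file
        files.pop(0)         # 818: continue with the next file
    return purged, [name for name, _ in files]


global_list = select_global_systxlist([
    {0: (120, 170), 1: (230, 300)},
    {0: (100, 150), 1: (200, 260)},
])
trail = [
    ("f1", {0: (10, 40), 1: (20, 90)}),
    ("f2", {0: (41, 99), 1: (91, 199)}),
    ("f3", {0: (80, 120), 1: (150, 210)}),
    ("f4", {0: (100, 150), 1: (200, 260)}),
    ("f5", {0: (140, 180), 1: (250, 310)}),
]
purged, kept = purge_image_trail(trail, global_list)
```

In this example f3 stops the scan because its high ID for CPU 0 (120) is not lower than the global low ID (100), so f4 and f5 are never examined.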
[0180] Alternately, and this would take much more
computational resources, at step 814 the audit records
in the image trail file are inspected to see if any audit
records in the file have a transaction ID that is not less than
the low transaction ID specified in the Global SysTxList
for the corresponding primary system CPU. If the image
trail file contains no such audit records, and the image
trail contains at least the RetainCount number of
files, the file is deleted (816). Otherwise, the file is
not deleted and processing of that image trail is complete.

Alternate Embodiments

[0181] The tasks performed by the Receiver, Updater
and Purger processes of the preferred embodiment can,
in other embodiments, be performed by processes performing
other tasks as well, or by a different set of processes.

[0182] The present invention can be implemented as
a computer program product that includes a computer
program mechanism embedded in a computer readable
storage medium. For instance, the computer program
product could contain the program modules for one or
more of the Receiver, Updater and Purger processes.
These program modules may be stored on a CD-ROM,
magnetic disk storage product, or any other computer
readable data or program storage product. The software
modules in the computer program product may also be
distributed electronically, via the Internet or otherwise,
by transmission of a computer data signal (in which the
software modules are embedded) on a carrier wave.
[0183] While the present invention has been described
with reference to a few specific embodiments,
the description is illustrative of the invention and is not
to be construed as limiting the invention. Various modifications
may occur to those skilled in the art without departing
from the true spirit and scope of the invention as
defined by the appended claims.

Claims

1. A method of operating a backup system so as to
replicate database updates performed on a primary
system, the method comprising:

receiving a stream of audit records from the pri- 
mary system, the audit records including audit 
update records indicating database updates 
generated by transactions executing on the pri- 
mary system; 

storing the audit update records in one or more 
image trails, and storing each image trail in a 
sequence of image trail files; 
storing in each image trail file a transaction ta- 
ble representing a range of transaction identifi- 
ers for transactions potentially pending in the 
primary system at the time that the first audit
record in the image trail file was generated by 
the primary system; 

for each image trail, accessing and processing 
the audit records in the sequence of image trail
files for that image trail; and 
periodically executing a file purge procedure for 
purging image trail files no longer needed,
including:

identifying an oldest transaction table from 
among a set of transaction tables each of 
which comprises the transaction table in 
the last image trail file accessed for each 
of the image trails; 

accessing an image trail file for one of the 
image trails; 

comparing a first set of newest transaction 
identifiers in the transaction table in the ac- 
cessed image trail file with a second set of 
oldest transaction identifiers in the identi- 
fied oldest transaction table, and condition- 
ally purging the accessed image trail file 
when all of the transaction identifiers in the 
first set are older than corresponding transaction
identifiers in the second set.

2. A method of operating a backup system so as to 
replicate database updates performed on a primary
system, the method comprising:

receiving a stream of audit records from the primary
system, the audit records including audit
update records indicating database updates 
generated by transactions executing on the pri- 
mary system, transaction state records and 
time interval control records, at least a subset 
of the transaction state records indicating a 
commit/abort outcome for a specified transac- 
tion; 

each audit update record and transaction state 
record including a transaction identifier that 
identifies a corresponding transaction on the 
primary system; 

storing the audit update records in one or more 
image trails; 

inspecting the received transaction state 
records in a predefined chronological order and
generating a current transaction table repre- 
senting a range of transaction identifiers for 
transactions for which there is at least one 
transaction state record between successive 
ones of the time interval control records in the 
stream of audit records; 
saving the current transaction table as a previ- 
ous transaction table and generating a new cur- 
rent transaction table whenever a time interval 
control record is received; 
storing each of the image trails as a sequence 
of image trail files, including generating a new 
image trail file each time a previous image trail 
file reaches a predefined state, and storing in 
each new image trail file a copy of the previous 
transaction table at the time that the new image 
trail file is generated; 

for each image trail, accessing and processing 
the audit records in the sequence of image trail 
files for that image trail; and 
periodically executing a file purge procedure for 
purging image trail files no longer needed, in- 
cluding: 

identifying an oldest transaction table copy 
from among a set of transaction table cop- 
ies that comprises the transaction table 
copy in the last image trail file accessed for 
each of the image trails; 
accessing an image trail file for one of the 
image trails;
comparing a first set of newest transaction 
identifiers in the transaction table copy in 
the accessed image trail file with a second 
set of oldest transaction identifiers in the 
identified oldest transaction table copy, 
and conditionally purging the accessed im- 
age trail file when all of the transaction 
identifiers in the first set are older than cor- 
responding transaction identifiers in the 
second set. 



3. The method of claim 2, wherein the step of period- 
ically executing a file purge procedure includes: 

storing in a global transaction table information
including the second set of oldest transaction
identifiers in the identified oldest transaction table
copy;

for each image trail for which there are more
than a predefined (RetainCount) number of image
trail files that have not been purged, performing
the steps of accessing an image trail
file, comparing the first and second sets of
transaction identifiers, and conditionally purging
the accessed image trail file.

4. The method of claim 3, wherein the step of period- 
ically executing a file purge procedure includes: 

for each image trail for which there are more
than a predefined (RetainCount) number of image
trail files that have not been purged,

accessing the image trail files for the image trail
in chronological order, excluding the RetainCount
most recent image trail files;
for each accessed image trail file comparing the
first and second sets of transaction identifiers;
and

conditionally purging the accessed image trail
file when all of the transaction identifiers in the
first set are older than corresponding transaction
identifiers in the second set.

5. The method of claim 4, wherein the step of period- 
ically executing a file purge procedure includes: 

for each image trail for which there are more
than a predefined (RetainCount) number of image
trail files that have not been purged,

accessing the image trail files for the image trail
in chronological order, excluding the RetainCount
most recent image trail files;
for each accessed image trail file comparing the
first and second sets of transaction identifiers;
conditionally purging the accessed image trail
file when all of the transaction identifiers in the
first set are older than corresponding transaction
identifiers in the second set; and
stopping the accessing of the image trail files
for the image trail when any of the transaction
identifiers in the first set are not older than corresponding
transaction identifiers in the second
set.

6. A computer program product for use in conjunction
with a backup computer system so as to replicate
database updates performed on a primary system,
the computer program product comprising a computer
readable storage medium and a computer
program mechanism embedded therein, the computer
program mechanism comprising:

a Receiver Module that receives and stores in
one or more image trails a stream of audit
records received from the primary system, the
audit records including audit update records indicating
database updates generated by transactions
executing on the primary system, and
stores each image trail in a sequence of image
trail files;

the Receiver Module storing in each image trail
file a transaction table representing a range of
transaction identifiers for transactions potentially
pending in the primary system at the time
that the first audit record in the image trail file
was generated by the primary system;
at least one Updater Module that sequentially
applies to a backup database the database updates
indicated by the audit update records, in
the order the audit update records are stored in
the image trails; and

a file purge procedure for purging image trail
files no longer needed, the file purge procedure
including instructions for:

identifying an oldest transaction table from
among a set of transaction tables that comprises
the transaction table in the last image
trail file accessed for each of the image
trails;

accessing an image trail file for one of the
image trails;

comparing a first set of newest transaction
identifiers in the transaction table in the accessed
image trail file with a second set of
oldest transaction identifiers in the identified
oldest transaction table, and conditionally
purging the accessed image trail file
when all of the transaction identifiers in the
first set are older than corresponding transaction
identifiers in the second set.

7. A method of operating a backup system so as to
replicate database updates performed on a primary
system, the method comprising:

a Receiver Module that receives and stores in
one or more image trails a stream of audit
records received from the primary system, the
audit records including audit update records indicating
database updates generated by transactions
executing on the primary system, transaction
state records and time interval control
records, at least a subset of the transaction
state records indicating a commit/abort outcome
for a specified transaction;
each audit update record and transaction state
record including a transaction identifier that
identifies a corresponding transaction on the
primary system;

the Receiver Module storing the audit update 

records in one or more image trails; 

the Receiver Module includes instructions for: 

inspecting the received transaction state 
records in a predefined chronological order 
and generating a current transaction table 
representing a range of transaction identifiers
for transactions for which there is at
least one transaction state record between 
successive ones of the time interval control 
records in the stream of audit records; 
saving the current transaction table as a 
previous transaction table and generating 
a new current transaction table whenever 
a time interval control record is received; 
storing each of the image trails as a se- 
quence of image trail files, including gen- 
erating a new image trail file each time a 
previous image trail file reaches a prede- 
fined state, and storing in each new image 
trail file a copy of the previous transaction 
table at the time that the new image trail file 
is generated; 

at least one Updater Module that sequentially 
applies to a backup database the database up- 
dates indicated by the audit update records, in 
the order the audit update records are stored in 
the image trails; and 

a file purge procedure for purging image trail 
files no longer needed, the file purge procedure 
including instructions for: 

identifying an oldest transaction table copy 
from among a set of transaction table cop- 
ies, each of which comprises the transac- 
tion table copy in the last image trail file ac- 
cessed for each of the image trails; 
accessing an image trail file for one of the 
image trails; 

comparing a first set of newest transaction 
identifiers in the transaction table copy in 
the accessed image trail file with a second 
set of oldest transaction identifiers in the 
identified oldest transaction table copy, 
and conditionally purging the accessed im- 
age trail file when all of the transaction 
identifiers in the first set are older than cor- 
responding transaction identifiers in the 
second set. 

8. The computer program product of claim 7, wherein 
the file purge procedure includes instructions for: 
storing in a global transaction table information
including the second set of oldest transaction 
identifiers in the identified oldest transaction ta- 
ble copy; 

for each image trail for which there are more 
than a predefined (RetainCount) number of im- 
age trail files that have not been purged, per- 
forming the steps of accessing an image trail 
file, comparing the first and second sets of 
transaction identifiers, and conditionally purg- 
ing the accessed image trail file. 

9. The computer program product of claim 8, wherein 
the file purge procedure includes instructions for: 

for each image trail for which there are more 
than a predefined (RetainCount) number of image 
trail files that have not been purged, 

accessing the image trail files for the image trail 
in chronological order, excluding the Retain- 
Count most recent image trail files; 
for each accessed image trail file comparing the 
first and second sets of transaction identifiers; 
and 

conditionally purging the accessed image trail 
file when all of the transaction identifiers in the 
first set are older than corresponding transac- 
tion identifiers in the second set. 

10. The computer program product of claim 9, wherein 
the file purge procedure includes instructions for: 

for each image trail for which there are more 
than a predefined (RetainCount) number of image 
trail files that have not been purged, 

accessing the image trail files for the image trail 
in chronological order, excluding the Retain- 
Count most recent image trail files; 
for each accessed image trail file comparing the 
first and second sets of transaction identifiers; 
conditionally purging the accessed image trail 
file when all of the transaction identifiers in the 
first set are older than corresponding transac- 
tion identifiers in the second set; and 
stopping the accessing of the image trail files 
for the image trail when any of the transaction 
identifiers in the first set are not older than cor- 
responding transaction identifiers in the second 
set. 

11. A backup computer system for replicating database 
updates performed on a primary system, compris- 
ing: 

a backup database; 

a Receiver Module for processing a stream of
audit records received from the primary system,
the audit records including audit update
records indicating database updates generated
by transactions executing on the primary system,
and stores each image trail in a sequence
of image trail files;

the Receiver Module storing in each image trail 
file a transaction table representing a range of 
transaction identifiers for transactions poten- 
tially pending in the primary system at the time 
that the first audit record in the image trail file 
was generated by the primary system; 
at least one Updater Module that sequentially 
applies to a backup database the database up- 
dates indicated by the audit update records, in 
the order the audit update records are stored in 
the image trails; and 

a file purge procedure for purging image trail 
files no longer needed, the file purge procedure 
including instructions for: 

identifying an oldest transaction table from 
among a set of transaction tables, each of 
which comprises the transaction table in 
the last image trail file accessed for each 
of the image trails; 

accessing an image trail file for one of the 
image trails; 

comparing a first set of newest transaction 
identifiers in the transaction table in the ac- 
cessed image trail file with a second set of 
oldest transaction identifiers in the identi- 
fied oldest transaction table, and condition- 
ally purging the accessed image trail file 
when all of the transaction identifiers in the 
first set are older than corresponding trans- 
action identifiers in the second set. 

12. A backup computer system for replicating database
updates performed on a primary system, compris- 
ing: 

a backup database; 

a Receiver Module that receives and stores in 
one or more image trails a stream of audit 
records received from the primary system, the 
audit records including audit update records in- 
dicating database updates generated by trans- 
actions executing on the primary system, trans- 
action state records and time interval control 
records, at least a subset of the transaction 
state records indicating a commit/abort out- 
come for a specified transaction; 
each audit update record and transaction state 
record including a transaction identifier that 
identifies a corresponding transaction on the 
primary system; 

the Receiver Module storing the audit update 

records in one or more image trails; 

the Receiver Module includes instructions for: 
inspecting the received transaction state
records in a predefined chronological order
and generating a current transaction table
representing a range of transaction identifiers
for transactions for which there is at
least one transaction state record between
successive ones of the time interval control
records in the stream of audit records;
saving the current transaction table as a
previous transaction table and generating
a new current transaction table whenever
a time interval control record is received;
storing each of the image trails as a sequence
of image trail files, including generating
a new image trail file each time a
previous image trail file reaches a predefined
state, and storing in each new image
trail file a copy of the previous transaction
table at the time that the new image trail file
is generated;



at least one Updater Module that sequentially 
applies to a backup database the database up- 
dates indicated by the audit update records, in 
the order the audit update records are stored in 
the image trails; and 

a file purge procedure for purging image trail 
files no longer needed, the file purge procedure 
including instructions for: 

identifying an oldest transaction table copy 
from among a set of transaction table cop- 
ies, each of which comprises the transac- 
tion table copy in the last image trail file ac- 
cessed for each of the image trails; 
accessing an image trail file for one of the 
image trails; 

comparing a first set of newest transaction 
identifiers in the transaction table copy in 
the accessed image trail file with a second 
set of oldest transaction identifiers in the 
identified oldest transaction table copy, 
and conditionally purging the accessed im- 
age trail file when all of the transaction 
identifiers in the first set are older than cor- 
responding transaction identifiers in the 
second set. 
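
The Receiver Module steps recited in claim 12 (building a current transaction table, rotating it on each time interval control record, and stamping each new image trail file with a copy of the previous table) can be illustrated with a minimal sketch. The sketch is not taken from the specification: the class and method names, the keying of the table by a numeric "source" (for example, a processor on the primary system), the use of a record count as the "predefined state", and the single image trail are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

# Hypothetical representation of a transaction table:
# source key -> (oldest transaction id, newest transaction id) seen in one interval.
TxTable = Dict[int, Tuple[int, int]]


@dataclass
class ImageTrailFile:
    # Copy of the previous transaction table taken when the file was generated.
    tx_table_copy: TxTable
    records: List[str] = field(default_factory=list)


class Receiver:
    """Illustrative Receiver handling one image trail (an assumption)."""

    def __init__(self, max_records_per_file: int = 2) -> None:
        self.current: TxTable = {}
        self.previous: TxTable = {}
        # A record count stands in for the claim's "predefined state".
        self.max_records = max_records_per_file
        self.files: List[ImageTrailFile] = [ImageTrailFile(tx_table_copy={})]

    def on_state_record(self, source: int, tx_id: int) -> None:
        # Widen the range of transaction identifiers tracked for this source.
        lo, hi = self.current.get(source, (tx_id, tx_id))
        self.current[source] = (min(lo, tx_id), max(hi, tx_id))

    def on_time_interval_control(self) -> None:
        # Save the current table as the previous table and start a new one.
        self.previous = self.current
        self.current = {}

    def on_audit_update(self, record: str) -> None:
        # Generate a new image trail file once the last one is "full",
        # storing in it a copy of the previous transaction table.
        if len(self.files[-1].records) >= self.max_records:
            self.files.append(ImageTrailFile(tx_table_copy=dict(self.previous)))
        self.files[-1].records.append(record)
```

In this sketch the table copied into a new file is always the last completed interval's table, so a file's copy never reflects a partially built interval.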

13. The backup computer system of claim 12, wherein 
the file purge procedure includes instructions for: 

storing in a global transaction table information 
including the second set of oldest transaction 
identifiers in the identified oldest transaction ta- 
ble copy; 

for each image trail for which there are more 
than a predefined (RetainCount) number of im- 
age trail files that have not been purged, per- 
forming the steps of accessing an image trail 
file, comparing the first and second sets of 
transaction identifiers, and conditionally purg- 
ing the accessed image trail file. 

14. The backup computer system of claim 13, wherein 
the file purge procedure includes instructions for: 

for each image trail for which there are more 
than a predefined (RetainCount) number of image 
trail files that have not been purged, 

accessing the image trail files for the image trail 
in chronological order, excluding the Retain- 
Count most recent image trail files; 
for each accessed image trail file comparing the 
first and second sets of transaction identifiers; 
and 

conditionally purging the accessed image trail 
file when all of the transaction identifiers in the 
first set are older than corresponding transac- 
tion identifiers in the second set. 

15. The backup computer system of claim 14, wherein 
the file purge procedure includes instructions for: 

for each image trail for which there are more 
than a predefined (RetainCount) number of image 
trail files that have not been purged, 

accessing the image trail files for the image trail 
in chronological order, excluding the Retain- 
Count most recent image trail files; 
for each accessed image trail file comparing the 
first and second sets of transaction identifiers; 
conditionally purging the accessed image trail 
file when all of the transaction identifiers in the 
first set are older than corresponding transac- 
tion identifiers in the second set; and 
stopping the accessing of the image trail files 
for the image trail when any of the transaction 
identifiers in the first set are not older than cor- 
responding transaction identifiers in the second 
set. 
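
The file purge procedure of claims 12 to 15 can be illustrated with a minimal sketch under stated assumptions. Each transaction table is modeled as a mapping from a hypothetical "source" key to an (oldest, newest) pair of transaction identifiers, smaller identifiers are taken to be older, and the identified oldest transaction table copy is passed in by the caller rather than computed. The function names, and the choice to keep a file whose source has no counterpart in the oldest table, are illustrative assumptions and do not come from the claims.

```python
from typing import Dict, List, Tuple

# Hypothetical representation: source key -> (oldest id, newest id).
TxTable = Dict[int, Tuple[int, int]]


def is_purgeable(file_table: TxTable, oldest_table: TxTable) -> bool:
    """True when every newest id in the file's table copy is older than
    the corresponding oldest id in the identified oldest table copy."""
    for source, (_, newest) in file_table.items():
        if source not in oldest_table:
            # Assumption: with no corresponding identifier, keep the file.
            return False
        if newest >= oldest_table[source][0]:
            return False
    return True


def purge_trail(files: List[TxTable], oldest_table: TxTable,
                retain_count: int) -> List[TxTable]:
    """Return the files left in one image trail after purging.

    `files` holds the table copy of each image trail file, oldest first.
    """
    # Claim 13: act only when more than RetainCount files remain unpurged.
    if len(files) <= retain_count:
        return list(files)
    # Claim 14: visit files in chronological order, excluding the
    # RetainCount most recent ones.
    candidates = len(files) - retain_count
    purged = 0
    for table in files[:candidates]:
        # Claim 15: stop at the first file that cannot be purged.
        if not is_purgeable(table, oldest_table):
            break
        purged += 1
    return files[purged:]
```

For example, with table copies whose newest identifiers for source 0 are 5, 9, 14 and 20, an oldest table of `{0: (10, 18)}`, and a retain count of 1, the first two files are purged and the scan stops at the third.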
Store transaction final state information in TST 
traverse of other image trails. 
unknown" for such transactions 
Updater Undo Pass 
Send request to Purger for permission to perform Undo Pass. 
Wait until permission message is received. 
End Undo Pass 
Store Undo List entries in a local transaction 
status table (local TST). 
Undo all updates associated with incomplete 
transactions: See Fig. 14B 
If in takeover mode { 
Set Takeover_Completed flag } 
If in Stop Updaters at Timestamp mode { 
Set TypeOfPass to Redo 
Set StartTimePosition to last IT record 
processed by Undo Pass. } 
Durably store Updater context record. 
Terminate Backup Updater process. 
Terminate Updater process. 

FIG. 14A 
Commit current Updater transaction. 
Undo Position = current IT position. 
Save Updater context record. 
Reset and start tx timer. 
Start new Updater transaction 

Read next earlier audit record in image trail. 
End Undo Pass: 
Commit Updater 
Transaction 

Backout the update from 
the backup database 

FIG. 14B 
Periodically activate Image Trail purging procedure 
Read last SysTxList sent by each Updater 
Global SysTxList = oldest SysTxList from Updaters 
Repeat for each Image Trail (including MIT): 

Start at oldest IT file 
Done 